enhance custom easyblock for GCC to use with-arch option for nvptx with 13.1+#3396
Test report by @SebastianAchilles Overview of tested easyconfigs (in order)
Build succeeded for 5 out of 5 (5 easyconfigs in total)
with-arch option for nvptx with 13.1+
Thanks a lot for the review. I agree with your comments and am working on addressing them in the PR.
Fixed the failed test workflow: https://github.com/easybuilders/easybuild-easyblocks/actions/runs/10196832799
Test report by @SebastianAchilles Overview of tested easyconfigs (in order)
Build succeeded for 5 out of 5 (5 easyconfigs in total)
Signed-off-by: Jan André Reuter <[email protected]>
@boegelbot please test @ jsc-zen3
@boegel: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)...
Details
- notification for comment with ID 2341789066 processed
Message to humans: this is just bookkeeping information for me,
Test report by @boegel Overview of tested easyconfigs (in order)
Build succeeded for 5 out of 5 (3 easyconfigs in total)
Test report by @boegelbot Overview of tested easyconfigs (in order)
Build succeeded for 3 out of 3 (3 easyconfigs in total)
Motivation
GCC, like other compilers, supports offloading via OpenMP and OpenACC, for example to make use of accelerators in user programs. While some compilers, such as Clang, require CUDA to be present for this, GCC does not need it to build and run an executable containing offloading code.
By default, however, GCC targets a very old architecture for NVIDIA GPUs. In GCC 12.3.0, this was sm_30. In GCC 13.3.0, the default is still the same, but recent nvptx-tools can bump it to sm_50 when CUDA is detected. With this, GCC can work around the removal of sm_3x in more recent CUDA versions, avoiding the following error message:
GCC 12.3.0
GCC 13.3.0
However, this may break again as soon as NVIDIA removes support for sm_50, which is already deprecated (as of CUDA 11.0). Fortunately, GCC has added a configure option to override the default nvptx architecture. Beginning with GCC 13.1.0, one can pass --with-arch=sm_[x] to set the default, as long as GCC understands the given architecture.
In addition, choosing a newer architecture by default might bring performance improvements and access to additional features.
Scope of this PR
This pull request adds the new option --with-arch=sm_[x] to GCC builds starting with GCC 13.1.0 if offloading support via nvptx is enabled. To choose which architecture is passed, a new function named map_nvptx_capability is implemented. This function retrieves cuda_compute_capabilities and matches them against the official GCC mappings used for the -march-map= argument (which can be found in ${GCC_SRC}/gcc/config/nvptx/nvptx.opt).
Since GCC only allows a single default architecture to be set, I decided to use the lowest one available. For example, JURECA-DC sets both 7.5 and 8.0 for EasyBuild. Therefore, 7.5 would be chosen.
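To illustrate the selection logic described above, here is a minimal sketch. It is not the code from this PR: the helper names parse_march_map and pick_default_nvptx_arch are hypothetical, and the sample text is only an assumed approximation of how march-map entries look in nvptx.opt.

```python
import re

# Illustrative excerpt only: the exact layout of nvptx.opt is an assumption.
SAMPLE_OPT = """\
march-map=sm_75
Target RejectNegative Alias(misa=,sm_75)

march-map=sm_80
Target RejectNegative Alias(misa=,sm_80)

march-map=sm_86
Target RejectNegative Alias(misa=,sm_80)
"""

# Matches a 'march-map=sm_XX' line followed by a line carrying 'Alias(misa=,sm_YY)'.
MAP_RE = re.compile(
    r"^march-map=(sm_\d+)[ \t]*\n.*?Alias\(misa=,(sm_\d+)\)", re.MULTILINE
)


def parse_march_map(opt_text):
    """Return a dict mapping requested architectures to the ones GCC uses."""
    return dict(MAP_RE.findall(opt_text))


def pick_default_nvptx_arch(opt_text, compute_capabilities):
    """Return the GCC architecture for the lowest mappable compute capability.

    Returns None if the mappings cannot be parsed or nothing can be mapped,
    in which case the caller should keep the GCC default.
    """
    mapping = parse_march_map(opt_text)
    # Sort numerically so that e.g. '7.5' comes before '8.0'.
    ordered = sorted(
        compute_capabilities,
        key=lambda cc: tuple(int(part) for part in cc.split(".")),
    )
    for cc in ordered:
        arch = mapping.get("sm_" + cc.replace(".", ""))
        if arch is not None:
            return arch
    return None
```

With the JURECA-DC example, pick_default_nvptx_arch(SAMPLE_OPT, ["7.5", "8.0"]) selects the mapping for 7.5. Returning None instead of raising keeps the caller free to fall back to the GCC default.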
If parsing the architecture mappings fails, for example because the file layout changed or the file was moved, a warning is issued and we stick to the GCC default. The same applies if none of the architectures in cuda_compute_capabilities can be mapped. This makes the additions more resilient to upstream changes.
Generally, this helps users, as they no longer have to pass architectures manually every single time, as is currently the case with CUDA 12 + GCC 12.3.0. There, one would need to pass -foffload-options=-misa=sm_80.
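The version gate and the fallback to the GCC default could be sketched as follows. This is an assumption-laden illustration rather than the easyblock code: nvptx_configure_option is a hypothetical helper, and the version is compared with plain tuples instead of EasyBuild's own version utilities.

```python
def nvptx_configure_option(gcc_version, arch):
    """Return the configure option for the nvptx default architecture.

    gcc_version: version string such as '13.3.0'.
    arch: architecture such as 'sm_75', or None if no mapping was found.
    Returns an empty string when the GCC default should be kept.
    """
    # Compare only major and minor version; --with-arch is used from 13.1 on.
    major_minor = tuple(int(part) for part in gcc_version.split(".")[:2])
    if arch is None or major_minor < (13, 1):
        return ""
    return "--with-arch=%s" % arch
```

For example, nvptx_configure_option("13.3.0", "sm_75") yields --with-arch=sm_75, while GCC 12.3.0 or a missing mapping yields an empty string, so the nvptx build is configured exactly as before.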