enhance custom easyblock for GCC to use `with-arch` option for nvptx with 13.1+ by Thyre · Pull Request #3396 · easybuilders/easybuild-easyblocks

Thyre · 2024-07-26T17:30:21Z

Motivation

GCC, like other compilers, supports offloading to accelerators, for example via OpenMP and OpenACC. While some compilers (e.g. Clang) require CUDA to be present for this, GCC does not need it to build and run an executable containing offloading code.

By default, however, GCC targets a very old architecture for NVIDIA GPUs. In GCC 12.3.0, this was sm_30. In GCC 13.3.0 the default is unchanged, but recent nvptx-tools can bump it to sm_50 when CUDA is detected. With this, GCC can work around the removal of sm_3x in more recent CUDA versions, avoiding the following error message:

GCC 12.3.0

$ gcc -fopenmp -foffload=nvptx-none test.c
ptxas fatal   : Value 'sm_35' is not defined for option 'gpu-name'
nvptx-as: ptxas returned 255 exit status
mkoffload: fatal error: x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: /p/software/fs/jurecadc/stages/2024/software/GCCcore/12.3.0/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.3.0//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated.
/p/software/jurecadc/stages/2024/software/binutils/2.40-GCCcore-12.3.0/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

GCC 13.3.0

$ gcc --verbose -fopenmp -foffload=nvptx-none test.c
[...]
/p/software/fs/jurecadc/stages/2025/software/GCCcore/13.3.0/bin/../libexec/gcc/x86_64-pc-linux-gnu/13.3.0/accel/nvptx-none/lto1 -quiet -dumpbase ./a.xnvptx-none.mkoffload -m64 -mgomp -misa=sm_30 -version -fno-openacc -fno-pie -fcf-protection=none -foffload-abi=lp64 -fopenmp @/tmp/ccZrKG87 -o /tmp/ccbOTp8K.s
[...]
Verifying sm_30 code with sm_50 code generation.
 ptxas -c -o /dev/null /tmp/cc7PheNR.o --gpu-name sm_50 -O0
[...]

However, this may break again as soon as NVIDIA removes support for sm_50, which has been deprecated since CUDA 11.0. Fortunately, GCC has added a configure option to override the default nvptx architecture. Beginning with GCC 13.1.0, one can pass `--with-arch=sm_[x]` to set the default, as long as GCC recognizes the value.

In addition, choosing a newer architecture by default might bring performance improvements and access to additional features.

Scope of this PR

This pull request adds the new option `--with-arch=sm_[x]` to GCC builds starting with GCC 13.1.0 if offloading support via nvptx is enabled. To choose which architecture is passed, a new function named `map_nvptx_capability` is implemented. It retrieves `cuda_compute_capabilities` and matches them against the official GCC mappings used for the `-march-map=` argument, which can be found in `${GCC_SRC}/gcc/config/nvptx/nvptx.opt`.
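To illustrate the version gate described above, here is a minimal sketch. It is not the code from this PR: the helper names `parse_version` and `nvptx_with_arch_option` are hypothetical, and the version comparison is done on plain integer tuples rather than with EasyBuild's own version utilities.

```python
def parse_version(version):
    """Turn a version string such as '13.1.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split('.'))


def nvptx_with_arch_option(gcc_version, sm_arch):
    """Return the configure option, or '' to keep GCC's default.

    Hypothetical helper: sm_arch is a string like 'sm_75', or None if
    no architecture could be determined.
    """
    if sm_arch is None or parse_version(gcc_version) < (13, 1):
        # Older GCC versions do not know --with-arch for nvptx.
        return ""
    return "--with-arch=" + sm_arch


print(nvptx_with_arch_option("13.3.0", "sm_75"))
print(repr(nvptx_with_arch_option("12.3.0", "sm_75")))
```

Returning an empty string for older versions means the configure command is left untouched for builds such as GCC 12.3.0.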

Since GCC only allows a single default architecture to be set, I decided to use the lowest one available. For example, JURECA-DC sets both 7.5 and 8.0 for EasyBuild, so 7.5 would be chosen.
If parsing the architecture mappings fails, for example because the file layout changed or the file was moved, a warning is issued and we stick to GCC's default. The same happens if the architectures in `cuda_compute_capabilities` cannot be mapped at all. This makes the additions more resilient to upstream changes.
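The selection and fallback logic can be sketched roughly as follows. This is an illustration under assumptions, not the actual `map_nvptx_capability` implementation: the function names, the regular expression, and the sample excerpt of `nvptx.opt` are all assumed here, and the real file layout may differ.

```python
import re

# Assumed excerpt of nvptx.opt; the real file may be laid out differently.
SAMPLE_NVPTX_OPT = """\
march-map=sm_75
Target RejectNegative Alias(misa=,sm_75)

march-map=sm_80
Target RejectNegative Alias(misa=,sm_80)

march-map=sm_86
Target RejectNegative Alias(misa=,sm_80)
"""

# A 'march-map=sm_XX' line directly followed by a line with the alias target.
MARCH_MAP_RE = re.compile(
    r"^march-map=(sm_\d+)[ \t]*\n.*?Alias\(misa=,(sm_\d+)\)", re.MULTILINE
)


def parse_march_map(opt_text):
    """Return {requested arch: arch GCC maps it to}; empty if nothing matches."""
    return dict(MARCH_MAP_RE.findall(opt_text))


def pick_default_arch(compute_capabilities, opt_text):
    """Map the lowest compute capability, or return None to keep GCC's default."""
    if not compute_capabilities:
        return None
    mapping = parse_march_map(opt_text)
    lowest = min(
        compute_capabilities,
        key=lambda cc: tuple(int(part) for part in cc.split('.')),
    )
    # An unparsable file or an unknown architecture both yield None.
    return mapping.get("sm_" + lowest.replace('.', ''))


print(pick_default_arch(["8.0", "7.5"], SAMPLE_NVPTX_OPT))
```

In this sketch, a `None` result stands in for the fallback case where a caller would emit a warning and omit the configure option.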

Generally, this helps users, as they no longer need to pass an architecture manually on every invocation. That is currently the case with CUDA 12 + GCC 12.3.0, where one needs to pass `-foffload-options=-misa=sm_80`.

SebastianAchilles · 2024-07-30T12:50:48Z

Test report by @SebastianAchilles

Overview of tested easyconfigs (in order)

SUCCESS GCCcore-12.3.0.eb
SUCCESS GCCcore-13.1.0.eb
SUCCESS GCCcore-13.2.0.eb
SUCCESS GCCcore-13.3.0.eb
SUCCESS GCCcore-14.1.0.eb

Build succeeded for 5 out of 5 (5 easyconfigs in total)
jscclxc1.int.jsc-clx.fz-juelich.de - Linux Rocky Linux 9.4, x86_64, Intel Xeon Processor (Cascadelake) (cascadelake), Python 3.9.18
See https://gist.github.com/SebastianAchilles/6890d9cc1f0024fe7541e064ba5009f8 for a full test report.

Thyre · 2024-08-01T07:26:27Z

Thanks a lot for the review. I agree with your comments and am working on adding them to the PR.

Thyre · 2024-08-14T08:32:58Z

Fixed the failed test workflow: https://github.com/easybuilders/easybuild-easyblocks/actions/runs/10196832799
I missed one f-string.

SebastianAchilles · 2024-08-14T17:25:36Z

Test report by @SebastianAchilles

Overview of tested easyconfigs (in order)

SUCCESS GCCcore-12.3.0.eb
SUCCESS GCCcore-13.1.0.eb
SUCCESS GCCcore-13.2.0.eb
SUCCESS GCCcore-13.3.0.eb
SUCCESS GCCcore-14.1.0.eb

Build succeeded for 5 out of 5 (5 easyconfigs in total)
jscclxc1.int.jsc-clx.fz-juelich.de - Linux Rocky Linux 9.4, x86_64, Intel Xeon Processor (Cascadelake) (cascadelake), Python 3.9.18
See https://gist.github.com/SebastianAchilles/2b79825a144baa828025421933035a68 for a full test report.

Signed-off-by: Jan André Reuter <[email protected]>

boegel · 2024-09-10T18:59:05Z

@boegelbot please test @ jsc-zen3
EB_ARGS="GCCcore-10.2.0.eb GCCcore-12.3.0.eb GCCcore-14.2.0.eb --installpath /tmp/$USER/pr-3396"

boegelbot · 2024-09-10T19:13:08Z

@boegel: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=3396 EB_ARGS="GCCcore-10.2.0.eb GCCcore-12.3.0.eb GCCcore-14.2.0.eb --installpath /tmp/$USER/pr-3396" EB_REPO=easybuild-easyblocks EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_3396 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

exit code: 0
output:

Submitted batch job 4839

Test results coming soon (I hope)...

Details

- notification for comment with ID 2341789066 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

boegel · 2024-09-10T21:02:57Z

Test report by @boegel

Overview of tested easyconfigs (in order)

SUCCESS GCCcore-12.3.0.eb
SUCCESS GCCcore-13.2.0.eb
SUCCESS zlib-1.3.1.eb
SUCCESS binutils-2.42.eb
SUCCESS GCCcore-14.2.0.eb

Build succeeded for 5 out of 5 (3 easyconfigs in total)
node3900.accelgor.os - Linux RHEL 8.8, x86_64, AMD EPYC 7413 24-Core Processor, 1 x NVIDIA NVIDIA A100-SXM4-80GB, 545.23.08, Python 3.6.8
See https://gist.github.com/boegel/1dcd0f4c7656396fcdacc42dfa4f04f7 for a full test report.

boegelbot · 2024-09-10T21:41:19Z

Test report by @boegelbot

Overview of tested easyconfigs (in order)

SUCCESS GCCcore-10.2.0.eb
SUCCESS GCCcore-12.3.0.eb
SUCCESS GCCcore-14.2.0.eb

Build succeeded for 3 out of 3 (3 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.4, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.18
See https://gist.github.com/boegelbot/c33373ba82e48b22ebec6b3a5aa2dc71 for a full test report.

Thyre force-pushed the gcc-use-witharch-build-option branch from c4c4e22 to b7308d2 Compare July 26, 2024 20:05

SebastianAchilles added this to the release after 4.9.2 milestone Jul 30, 2024

SebastianAchilles added the enhancement label Jul 30, 2024

boegel requested changes Jul 31, 2024

boegel changed the title ~~GCC: Use with-arch option for nvptx with 13.1+~~ enhance custom easyblock for GCC to use with-arch option for nvptx with 13.1+ Jul 31, 2024

Thyre force-pushed the gcc-use-witharch-build-option branch 2 times, most recently from 2756041 to 000c666 Compare August 1, 2024 10:44

Thyre force-pushed the gcc-use-witharch-build-option branch from 000c666 to 5a86bb8 Compare August 14, 2024 07:51

Thyre requested a review from boegel August 14, 2024 08:33

boegel requested changes Aug 28, 2024

Comment thread easybuild/easyblocks/g/gcc.py

GCC: Use with-arch option for nvptx with 13.1+

bce12fe

Signed-off-by: Jan André Reuter <[email protected]>

Thyre force-pushed the gcc-use-witharch-build-option branch from 5a86bb8 to bce12fe Compare August 28, 2024 14:35

Thyre requested a review from boegel August 28, 2024 14:49

boegel approved these changes Sep 10, 2024

boegel merged commit 485a195 into easybuilders:develop Sep 11, 2024
