boegel changed the title from "Add upstream GCC patch to avoid spurious FPE on avx512 (affects UCX)" to "add upstream patch for GCC 9.x, 10.x, 11.x to avoid spurious FPE on avx512 (affects UCX)" on Aug 5, 2021
Test report by @bartoldeman SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
build-node.computecanada.ca - Linux centos linux 7.9.2009, x86_64, Intel Xeon Processor (Skylake, IBRS), Python 3.7.7
See https://gist.github.com/15d8aa863dd891939d63b68fe6a65956 for a full test report.
program main
  use mpi
  implicit none
  integer ierr, iproc, imol
  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, iproc, ierr)
  write(*,*) 'iproc', iproc
  if (iproc == 0) then
    imol = 1
  end if
  call MPI_BCAST(imol, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
  write(*,*) 'iproc,imol', iproc, imol
  call MPI_FINALIZE(ierr)
end
Compile with the foss toolchain (note: no arch option is necessary here, since the issue is not in the example): mpifort -ffpe-trap=invalid mpibcast.f90 -o mpibcast
Run with: srun --nodes=2 --ntasks-per-node=4 --time=0:05:00 ./mpibcast
This should get you something like:
iproc 1
iproc 5
iproc 0
iproc 2
iproc 3
[blg8429:99624:0:99624] Caught signal 8 (Floating point exception: floating-point invalid operation)
[blg8429:99625:0:99625] Caught signal 8 (Floating point exception: floating-point invalid operation)
[blg8429:99626:0:99626] Caught signal 8 (Floating point exception: floating-point invalid operation)
[blg8429:99627:0:99627] Caught signal 8 (Floating point exception: floating-point invalid operation)
iproc 4
iproc 6
iproc 7
==== backtrace (tid: 99624) ====
0 0x000000000002078e ucs_debug_print_backtrace() /tmp/ebuser/avx512/UCX/1.8.0/GCCcore-9.3.0/ucx-1.8.0/src/ucs/debug/debug.c:653
1 0x00000000000130f0 __funlockfile() :0
2 0x000000000001ba9b ucp_ep_config_get_zcopy_auto_thresh() /tmp/ebuser/avx512/UCX/1.8.0/GCCcore-9.3.0/ucx-1.8.0/src/ucp/core/ucp_ep.c:1953
...
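Since the title ties the spurious FPE to avx512, it can help to check whether a node advertises avx512 before trying the reproducer. The following is a minimal sketch, not part of this PR: the helper names `has_avx512` and `print_reproducer` are hypothetical, and it assumes a Linux-style `/proc/cpuinfo` with a `flags` line. It only prints the compile and run commands quoted above rather than executing them, because those need an MPI toolchain and a Slurm allocation.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the PR): succeed if a
# cpuinfo-style file lists any avx512 feature flag.
# Defaults to /proc/cpuinfo, which is assumed to exist on Linux nodes.
has_avx512() {
    cpuinfo="${1:-/proc/cpuinfo}"
    grep -q -E '^flags[[:space:]]*:.*avx512' "$cpuinfo" 2>/dev/null
}

# Print (do not run) the compile and run commands from the comment above.
print_reproducer() {
    echo "mpifort -ffpe-trap=invalid mpibcast.f90 -o mpibcast"
    echo "srun --nodes=2 --ntasks-per-node=4 --time=0:05:00 ./mpibcast"
}

if has_avx512; then
    echo "avx512 flags found: this node may be able to reproduce the FPE"
    print_reproducer
else
    echo "no avx512 flags found: the FPE may not reproduce on this node"
fi
```

Note that this only inspects the node where the script runs; with `srun --nodes=2` the compute nodes may differ from the login node, so the check would need to be run inside the allocation to be meaningful.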
Test report by @branfosj SUCCESS
Build succeeded for 9 out of 9 (9 easyconfigs in total)
bear-pg0211u03a.bear.cluster - Linux RHEL 8.3, x86_64, Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz (cascadelake), Python 3.6.8
See https://gist.github.com/9be97944d126c9e88a6262110468c0e4 for a full test report.
Test report by @Micket SUCCESS
Build succeeded for 11 out of 11 (9 easyconfigs in total)
alvis-c1 - Linux centos linux 7.9.2009, x86_64, Intel Xeon Processor (Skylake), Python 3.6.8
See https://gist.github.com/1ae2d598201ec82f9c8d22d8a91392ad for a full test report.
Test report by @boegel SUCCESS
Build succeeded for 9 out of 9 (9 easyconfigs in total)
node2625.swalot.os - Linux centos linux 7.9.2009, x86_64, Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz (haswell), Python 3.6.8
See https://gist.github.com/03c60a468a51604ae12e2ec61472d630 for a full test report.
(created using eb --new-pr)