restore RPATH wrappers for OpenMPI sanity check #2582

Merged
bartoldeman merged 2 commits into easybuilders:develop from boegel:openmpi_sanity_check_rpath
Oct 21, 2021

Conversation

@boegel
Member

@boegel boegel commented Sep 24, 2021

WIP because I need to check whether this fixes a problem reported in Slack, after I reproduce the issue first...

@boegel boegel added the bug fix label Sep 24, 2021
@boegel boegel added this to the next release (4.5.0?) milestone Sep 24, 2021
@stderr-enst

stderr-enst commented Sep 24, 2021

@boegel thanks for the PR!
Going over OpenMPI in foss-2019b with --module-only --include-easyblocks-from-pr 2582 seemed to work in my debug build (designated local buildpath, --debug flag, and set tmp dirs). To double-check, I also ran the build in foss-2021a without --module-only, in a "production setting", and this gives me the same error as before:

FAILED: Installation ended unsuccessfully (build directory: /tmp/easybuild/rome/OpenMPI/4.1.1/GCC-10.3.0): build failed (first 300 chars): Sanity check failed: sanity check command mpirun -n 8 /tmp/easybuild/rome/OpenMPI/4.1.1/GCC-10.3.0/mpi_test_hello_mpifh exited with code 127 (output: /tmp/easybuild/rome/OpenMPI/4.1.1/GCC-10.3.0/mpi_test_hello_mpifh: error while loading shared libraries: libgfortran.so.5: cannot open shared object f...

I'm a bit confused by this and think the only relevant difference should be running with/without --module-only.
Could this be affected by possibly left-over files from a previous attempt to build OpenMPI in 2021a?

In any case, I'm trying to apply this again in a clean build with 2021a and will get back to you. Just wanted to give an update on this.
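
For anyone reproducing the error above: exit code 127 with a missing libgfortran.so.5 is what one would expect if the test binary was linked without RPATH entries, although that cause is an assumption here, not something confirmed in this thread. A minimal sketch for checking this is to parse the text printed by `readelf -d <binary>`. The helper name `rpath_entries` and the sample paths are hypothetical and only for illustration.

```python
import re

# Matches the RPATH/RUNPATH lines in the dynamic section printed by `readelf -d`.
_RPATH_LINE = re.compile(r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*)\]")


def rpath_entries(readelf_output):
    """Return the RPATH/RUNPATH directories found in `readelf -d` output text.

    Hypothetical helper for illustration; not part of EasyBuild.
    """
    entries = []
    for line in readelf_output.splitlines():
        match = _RPATH_LINE.search(line)
        if match:
            # Several directories are separated by ':' inside the brackets.
            entries.extend(path for path in match.group(2).split(":") if path)
    return entries


# Made-up sample output; the paths are placeholders, not taken from a real build.
sample = """
 0x0000000000000001 (NEEDED)             Shared library: [libgfortran.so.5]
 0x000000000000001d (RUNPATH)            Library runpath: [/opt/example/lib:/opt/example/lib64]
"""

no_rpath = " 0x0000000000000001 (NEEDED)             Shared library: [libgfortran.so.5]"
```

An empty result for a binary that needs libgfortran.so.5 would be consistent with the loader failing unless the library directory is on LD_LIBRARY_PATH.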

@stderr-enst

stderr-enst commented Sep 27, 2021

Building OpenMPI again in 2021a still fails. It also fails in 2019b with --rebuild. Running again with --module-only with the eb from this PR, seems to make the build happy, but my feeling is that the part resulting in the error is skipped (although the sanity check step is run). Not sure why module-only makes a difference.

@boegel
Member Author

boegel commented Sep 28, 2021

> Building OpenMPI again in 2021a still fails. It also fails in 2019b with --rebuild. Running again with --module-only with the eb from this PR, seems to make the build happy, but my feeling is that the part resulting in the error is skipped (although the sanity check step is run). Not sure why module-only makes a difference.

Indeed, the example programs are not run during the sanity check under --module-only, because they are compiled from the unpacked source directory (which is not there when using --module-only). That's a bug in how the sanity check is done; using --module-only shouldn't make a difference.
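
The behaviour described above can be sketched as follows. This is only an illustration of the described logic, not the actual code in easybuild/easyblocks/o/openmpi.py: the function name `planned_example_runs` and its parameters are hypothetical, and the `mpi_test_` prefix and `mpirun -n 8` are taken from the error message earlier in this thread.

```python
import os


def planned_example_runs(source_dir, example_names, nprocs=8):
    """Return the mpirun commands the sanity check would run for example programs.

    Hypothetical helper for illustration; not part of the EasyBuild API.
    """
    if not os.path.isdir(source_dir):
        # With --module-only nothing is unpacked, so there is nothing to
        # compile or run, and the example-based checks are silently skipped.
        return []
    return [
        "mpirun -n %d %s" % (nprocs, os.path.join(source_dir, "mpi_test_" + name))
        for name in example_names
    ]
```

With an existing directory this yields one command per example; with a missing directory it yields an empty list, which is why the sanity check can pass under --module-only without exercising the failing binary.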

@boegel boegel added the EESSI Related to EESSI project label Oct 21, 2021
Comment thread easybuild/easyblocks/o/openmpi.py Outdated
@boegel boegel changed the title from "restore RPATH wrappers for OpenMPI sanity check (WIP)" to "restore RPATH wrappers for OpenMPI sanity check" Oct 21, 2021
@boegel
Member Author

boegel commented Oct 21, 2021

Test report by @boegel

Overview of tested easyconfigs (in order)

  • SUCCESS OpenMPI-4.0.3-GCC-9.3.0.eb

Build succeeded for 1 out of 1 (1 easyconfigs in total)
login1 - Linux UNKNOWN UNKNOWN, x86_64, Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz (haswell), Python 3.9.5
See https://gist.github.com/1eccc898554fcc6badfb7be6fbf95de9 for a full test report.

edit: this was tested in the EESSI environment, where --rpath is enabled

Contributor

@bartoldeman bartoldeman left a comment

Lgtm

@bartoldeman bartoldeman merged commit 16b1525 into easybuilders:develop Oct 21, 2021
@boegel boegel deleted the openmpi_sanity_check_rpath branch October 22, 2021 06:27

Labels

bug fix EESSI Related to EESSI project

3 participants