
{chem}[foss,iomkl,intel/2018b] GPAW v1.4.0, GPAW-setups, libxc, libvdwxc #6984

Merged
migueldiascosta merged 6 commits into easybuilders:develop from schiotz:20181009155429_new_pr_GPAW140
Nov 26, 2018

@schiotz
Contributor

@schiotz schiotz commented Oct 9, 2018

(created using eb --new-pr)

…s-0.8.7929.eb, GPAW-setups-0.9.11271.eb, GPAW-setups-0.9.20000.eb, GPAW-setups-0.9.9672.eb, libxc-3.0.1-foss-2018b.eb, libvdwxc-0.3.2-foss-2018b.eb and patches: GPAW-1.4.0-customize.patch, GPAW-1.4.0-silence-numpy-warning-8e072ac8.patch
@schiotz changed the title from "{chem}[dummy/dummy] GPAW v1.4.0, GPAW-setups v0.8.7929, GPAW-setups v0.9.11271, ..." to "[WIP] {chem}[dummy/dummy] GPAW v1.4.0, GPAW-setups v0.8.7929, GPAW-setups v0.9.11271, ..." (Oct 9, 2018)
@schiotz changed the title from "[WIP] {chem}[dummy/dummy] GPAW v1.4.0, GPAW-setups v0.8.7929, GPAW-setups v0.9.11271, ..." to "[WIP] {chem}[foss,iomkl,intel] GPAW v1.4.0, GPAW-setups, libxc, libvdwxc ..." (Oct 9, 2018)
@schiotz changed the title from "[WIP] {chem}[foss,iomkl,intel] GPAW v1.4.0, GPAW-setups, libxc, libvdwxc ..." to "[WIP] {chem}[foss,iomkl,intel/2018b] GPAW v1.4.0, GPAW-setups, libxc, libvdwxc" (Oct 9, 2018)
@schiotz
Contributor Author

schiotz commented Oct 9, 2018

This is a new set of EasyConfigs for GPAW 1.4.0, based on the work by @Micket et al. in #6514

I have made the following changes compared to that PR:

  • Using libxc version 3.0.1 instead of 4.X as the latter triggers a bug. It is unclear if the bug is in libxc or in GPAW, but there is risk of getting wrong results (if the user generates PAW setups). There are also other strange symptoms from using libxc 4.X. See https://gitlab.com/gpaw/gpaw/issues/161

  • Moved from multithreaded libraries to single-threaded libraries, as GPAW handles parallelization explicitly with MPI, and using multithreaded libraries without setting the number of threads to one may cause CPU oversubscription.

  • The version using the foss toolchain is built with libvdwxc, but the ones with iomkl and intel toolchains are built without that library, since the test suite crashes. This is being investigated in https://gitlab.com/gpaw/gpaw/issues/163

  • A numpy 1.15.0 patch is applied, silencing a harmless but annoying warning.

  • The foss configuration patch is suitable for general toolchains, but the other two are specific.
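
As a rough illustration of the oversubscription point in the list above: if a multithreaded BLAS is in use, a common workaround is to pin the thread count to one per MPI rank before any numerical library is loaded. The sketch below is not part of this PR. The helper names `pin_single_thread` and `oversubscription_factor` are hypothetical, and the environment variables are the ones commonly read by OpenMP, Intel MKL and OpenBLAS.

```python
import os

# Environment variables commonly read by OpenMP, Intel MKL and OpenBLAS.
THREAD_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def pin_single_thread(env=None):
    """Set each thread-count variable to "1" unless it is already set.

    Hypothetical helper for illustration; call it before importing numpy
    or any other library that initialises a threaded BLAS.
    """
    env = os.environ if env is None else env
    for var in THREAD_VARS:
        env.setdefault(var, "1")
    return {var: env[var] for var in THREAD_VARS}


def oversubscription_factor(mpi_ranks, threads_per_rank, cores):
    """Ratio of runnable threads to cores; a value above 1 means oversubscribed."""
    return mpi_ranks * threads_per_rank / cores
```

With 16 MPI ranks on a 16-core node, a threaded library that defaults to 16 threads per rank gives a factor of 16, whereas single-threaded libraries give a factor of 1.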

@boegelbot
Collaborator

Travis test report: 7/7 runs failed - see https://travis-ci.org/easybuilders/easybuild-easyconfigs/builds/439167261

Only showing partial log for 1st failed test suite run 11165.1;
full log at https://travis-ci.org/easybuilders/easybuild-easyconfigs/jobs/439167262

...
FAIL: Specific checks only done for the (easyconfig) files that were changed in a pull request.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/easybuilders/easybuild-easyconfigs/test/easyconfigs/easyconfigs.py", line 385, in test_changed_files_pull_request
    self.check_sha256_checksums(changed_ecs)
  File "/home/travis/build/easybuilders/easybuild-easyconfigs/test/easyconfigs/easyconfigs.py", line 351, in check_sha256_checksums
    self.assertTrue(len(checksum_issues) == 0, "No checksum issues:\n%s" % '\n'.join(checksum_issues))
AssertionError: No checksum issues:
Checksums missing for one or more sources/patches in GPAW-1.4.0-intel-2018b-Python-3.6.6.eb: found 1 sources + 2 patches vs 1 checksums

----------------------------------------------------------------------
Ran 9744 tests in 742.957s

FAILED (failures=1)
ERROR: Not all tests were successful.

(bleep, bloop, I'm just a bot, please talk to my owner @boegel if you notice me acting stupid)
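
The failure above comes down to a count mismatch: the easyconfig lists one source and two patches but only one checksum. The following is a minimal sketch of that kind of check, not EasyBuild's actual test code. The function names are hypothetical and the file names are placeholders, since the log does not show them.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA256 digest of a byte string (e.g. a file's contents)."""
    return hashlib.sha256(data).hexdigest()


def checksum_issues(name, sources, patches, checksums):
    """Report a mismatch between the number of files and the number of checksums.

    Hypothetical re-creation of the kind of check the test performs;
    this is not EasyBuild's implementation.
    """
    expected = len(sources) + len(patches)
    if len(checksums) != expected:
        return ["Checksums missing for one or more sources/patches in %s: "
                "found %d sources + %d patches vs %d checksums"
                % (name, len(sources), len(patches), len(checksums))]
    return []


issues = checksum_issues(
    "GPAW-1.4.0-intel-2018b-Python-3.6.6.eb",
    sources=["SOURCE_TARBALL"],        # placeholder: file name not shown in the log
    patches=["PATCH_1", "PATCH_2"],    # placeholders for the two patches
    checksums=["PLACEHOLDER_SHA256"],  # only one checksum is listed
)
```

The fix is to list one SHA256 checksum per source and per patch, in the same order as the files. EasyBuild's `--inject-checksums` option should be able to add them automatically.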

@Micket
Contributor

Micket commented Oct 9, 2018

Just wanted to say thanks for picking up this PR, I have been kept very busy the past few weeks and just haven't had time to pick it up again.

@migueldiascosta
Member

Test report by @migueldiascosta
SUCCESS
Build succeeded for 11 out of 11 (11 easyconfigs in this PR)
grc-cluster1 - Linux centos 6.10, Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz, Python 2.7.14
See https://gist.github.com/edb22c522aa14af8d6484962af756467 for a full test report.

@schiotz schiotz changed the title [WIP] {chem}[foss,iomkl,intel/2018b] GPAW v1.4.0, GPAW-setups, libxc, libvdwxc {chem}[foss,iomkl,intel/2018b] GPAW v1.4.0, GPAW-setups, libxc, libvdwxc Oct 11, 2018
@schiotz
Contributor Author

schiotz commented Oct 11, 2018

@boegel
In my opinion, this PR is now ready for review. As mentioned above, GPAW is provided with libvdwxc support in the foss toolchain, but not in the iomkl and intel toolchains, due to an incompatibility with Intel MKL. This may be solved in future versions of GPAW. Users who need libvdwxc (probably a minority) will have to live with the somewhat worse performance of the foss build.
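
A minimal sketch of how the dependency list differs per toolchain, written in easyconfig-style `(name, version)` tuples. This is illustrative only and does not reproduce the easyconfigs in this PR: the function `gpaw_dependencies` is hypothetical, other dependencies are omitted, and it assumes libxc 3.0.1 is used for every toolchain as described above.

```python
def gpaw_dependencies(toolchain_name):
    """Return easyconfig-style (name, version) pairs for a given toolchain.

    Illustrative only: the real easyconfigs list further dependencies
    that are omitted here.
    """
    deps = [("libxc", "3.0.1")]  # 3.0.1 rather than 4.X, see gpaw issue 161
    if toolchain_name == "foss":
        # libvdwxc is enabled only for foss; MKL-based builds skip it
        deps.append(("libvdwxc", "0.3.2"))
    return deps
```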

Thanks to @Micket for laying the groundwork for this in his PR.

CC: @mikstr @OleHolmNielsen

@schiotz
Contributor Author

schiotz commented Oct 18, 2018

@boegel Is there any hope of getting this one into the next EasyBuild release?

@schiotz
Contributor Author

schiotz commented Nov 22, 2018

@boegel It would be very valuable for us if this PR could make it into the next EasyBuild release.

Best regards
Jakob

@migueldiascosta
Member

Test report by @migueldiascosta
SUCCESS
Build succeeded for 11 out of 11 (11 easyconfigs in this PR)
grc-cluster1 - Linux centos 6.10, Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz, Python 2.7.14
See https://gist.github.com/81937eda0671e63f4e3bd727050765b0 for a full test report.

@migueldiascosta
Member

for me, both serial and parallel tests pass for both intel and foss toolchains, as well as the serial tests for iomkl, but the parallel tests for iomkl often hang and/or fail, e.g. with

gpaw.PoissonConvergenceError: Poisson solver did not converge in 1000 iterations!

@schiotz
Contributor Author

schiotz commented Nov 23, 2018

@migueldiascosta
Yes, we have also observed that the developer version of gpaw often crashes in the extended self tests on our own cluster when built with the iomkl toolchain. There are different crashes, but often it is a core dump in the openmpi libraries. I am not 100% sure that the openmpi library behaves well when compiled with the Intel compiler.

Obviously, it could also be some subtle bug in gpaw triggering this crash in openmpi when built with the more aggressively optimizing Intel compiler. We have just decided locally to only build gpaw with the intel and foss toolchains.

I think it would be wise to withdraw the easyconfigs for the iomkl toolchain, but will discuss it with my colleagues before doing it.

Thank you very much indeed for testing!

@schiotz
Contributor Author

schiotz commented Nov 23, 2018

@boegel @migueldiascosta I have withdrawn the iomkl easyconfigs, as GPAW does not work reliably when compiled with that toolchain. If/when the cause is found, I will add iomkl again for later releases of gpaw. But now that Intel MPI can be used for free, there is less need for iomkl.

@migueldiascosta migueldiascosta added this to the 3.8.0 milestone Nov 26, 2018
Member

@migueldiascosta migueldiascosta left a comment

lgtm

@migueldiascosta
Member

Going in, thanks @schiotz!
