Skip to content

add patch for Tensorflow 1.6 & 1.7 to include missing -lrt link flag (needed in CentOS6)#6089

Merged
boegel merged 4 commits intoeasybuilders:3.6.xfrom
Micket:20180328185104_new_pr_msIXpLMEft
Apr 26, 2018
Merged

add patch for Tensorflow 1.6 & 1.7 to include missing -lrt link flag (needed in CentOS6)#6089
boegel merged 4 commits intoeasybuilders:3.6.xfrom
Micket:20180328185104_new_pr_msIXpLMEft

Conversation

@Micket
Copy link
Copy Markdown
Contributor

@Micket Micket commented Mar 28, 2018

(created using eb --new-pr)

@boegel
Copy link
Copy Markdown
Member

boegel commented Mar 29, 2018

@Micket Just adding the patch without using it in any of the easyconfigs is a bit confusing...

Does including this patch on CentOS 7 or other operating systems cause problems? If not, we should always use it?

@boegel boegel added the bug fix label Mar 29, 2018
@Micket
Copy link
Copy Markdown
Contributor Author

Micket commented Mar 29, 2018

@boegel Sorry, I was going to comment it, but --new-pr took quite a long time, and I forgot about this.

The patch is something I needed to use with Tensorflow in order to built it on CentOS6, because of a link-flag that doesn't get propagated. I can't imagine it hurting compilation on any other system, because it will be linking to this library anyway. I got the fix from:
tensorflow/tensorflow#15129

Though, on CenOS6, I still need to modify the original configs with_jemalloc=False (or some other workaround, due to the old kernel).
But I figured it would at least be useful to have it available for those on CentOS6

@boegel
Copy link
Copy Markdown
Member

boegel commented Mar 29, 2018

@Micket Can you update this PR to use this patch in the TensorFlow 1.6.0 easyconfigs we have, so we can easily test it on other OSs?

If it doesn't cause problems there, I would include it, one less problem to worry about on CentOS 6...

@boegel
Copy link
Copy Markdown
Member

boegel commented Mar 29, 2018

Test report by @boegel
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in this PR)
node2445.golett.os - Linux centos linux 7.4.1708, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz, Python 2.7.5
See https://gist.github.com/f9a126cab36548e5fe94142e070e6122 for a full test report.

@boegel
Copy link
Copy Markdown
Member

boegel commented Mar 29, 2018

@vanzod Up for testing this to make sure the patch doesn't cause any trouble on other OSs?

@boegel
Copy link
Copy Markdown
Member

boegel commented Mar 29, 2018

@Micket Any idea if this is relevant when building with Intel compilers? Did you try that?

@Micket
Copy link
Copy Markdown
Contributor Author

Micket commented Mar 29, 2018

@boegel I think it would apply there as well, but I haven't tested.
I didn't grab the intel version because it pulled in more dependencies from the dev branch (and I haven't gotten around to building the intel 2018 toolchains yet).

I'll give it a try

@Micket
Copy link
Copy Markdown
Contributor Author

Micket commented Mar 29, 2018

@boegel intel/2018a requires us to first update the license server to 11.14, which it refused to do on our virtual machines due to the cpu model number 😲
Long story short; I can't get around to testing the intel version.

@boegel
Copy link
Copy Markdown
Member

boegel commented Mar 29, 2018

@Micket I would still consider also using it in the intel/2018a TensorFlow 1.6 easyconfig, we can easily test whether it breaks on CentOS 7. If not, I'd include it.

Also, do you think TF 1.7.0 also suffers from this same issue? Asking for a friend, see #6098

@Micket
Copy link
Copy Markdown
Contributor Author

Micket commented Mar 29, 2018

@boegel I tried building #6098 and I get the same error (undefined reference to 'clock_gettime'). The patch should help there as well.

@boegel
Copy link
Copy Markdown
Member

boegel commented Mar 30, 2018

@Micket Can you sync with PR with current develop and also include the patch for TF 1.7.0? Makes sense to do that in a single PR...

@boegel
Copy link
Copy Markdown
Member

boegel commented Mar 30, 2018

Test report by @boegel
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
node2118.delcatty.os - Linux centos linux 7.4.1708, Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz, Python 2.7.5
See https://gist.github.com/dbfb90aaa30966db1068ca2933f25e79 for a full test report.

@boegel
Copy link
Copy Markdown
Member

boegel commented Mar 30, 2018

Test report by @boegel
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
node2531.golett.os - Linux centos linux 7.4.1708, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz, Python 2.7.5
See https://gist.github.com/a63e8c8b4680cf6c31644482f92ca89b for a full test report.

@Micket Micket force-pushed the 20180328185104_new_pr_msIXpLMEft branch from b4d0bf3 to 8130848 Compare April 3, 2018 10:48
@Micket
Copy link
Copy Markdown
Contributor Author

Micket commented Apr 3, 2018

@boegel Done, should be ready for another test build for 1.7.0

@boegel boegel added this to the 3.6.0 milestone Apr 25, 2018
@boegel
Copy link
Copy Markdown
Member

boegel commented Apr 25, 2018

Test report by @boegel
FAILED
Build succeeded for 2 out of 3 (3 easyconfigs in this PR)
node2644.swalot.os - Linux centos linux 7.4.1708, Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz, Python 2.7.5
See https://gist.github.com/81b752064b961f7621193dc1233f5545 for a full test report.

@boegel
Copy link
Copy Markdown
Member

boegel commented Apr 25, 2018

@Micket this needs another sync with develop, since it's missing the checksum fix in #6147 (that's why the test report failed)

@boegel boegel changed the title Tensorflow patch adding -lrt link flag (needed in CentOS6) add patch for Tensorflow 1.6 & 1.7 to include missing -lrt link flag (needed in CentOS6) Apr 26, 2018
@Micket Micket force-pushed the 20180328185104_new_pr_msIXpLMEft branch from 8130848 to 4ac8b8d Compare April 26, 2018 10:59
boegel added a commit to boegel/easybuild-easyconfigs that referenced this pull request Apr 26, 2018
@boegel boegel changed the base branch from develop to 3.6.x April 26, 2018 11:30
@boegel
Copy link
Copy Markdown
Member

boegel commented Apr 26, 2018

I've tested this outside of this PR after merging this branch with current develop to fix the checksum issue for TF 1.7.0, and didn't see any problems with it, so I'll go ahead and merge this into the 3.6.x branch to be included in EasyBuild v3.6.0.

Thanks @Micket!

@boegel boegel merged commit a181303 into easybuilders:3.6.x Apr 26, 2018
@Micket Micket deleted the 20180328185104_new_pr_msIXpLMEft branch May 2, 2019 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants