extra patch for TensorFlow 1.12.0 to remove -B/usr/bin from linker_bin_path_flag in cuda_configure.bzl #7800

Merged
akesandgren merged 3 commits into easybuilders:develop from FokkeDijkstra:tensorflow-fosscuda-2018b-extra-patch
Apr 2, 2019

@FokkeDijkstra
Contributor

Added extra patch to remove -B/usr/bin from linker_bin_path_flag in
tensorflow-1.12.0/third_party/gpus/cuda_configure.bzl
This prevents the wrong linker from being used.
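
To illustrate the effect of this change, here is a minimal Python sketch. It is not the patch itself, which edits the Bazel code in cuda_configure.bzl. The function name `drop_bin_prefix_flag` and the sample flag list are hypothetical; only the `-B/usr/bin` flag is taken from this PR.

```python
def drop_bin_prefix_flag(link_flags, flag="-B/usr/bin"):
    """Return a copy of link_flags without the given -B prefix flag."""
    return [f for f in link_flags if f != flag]


# Hypothetical sample: a few flags of the kind passed to the link step.
flags = ["-no-canonical-prefixes", "-B/usr/bin", "-Wl,--gc-sections"]
cleaned = drop_bin_prefix_flag(flags)
print(cleaned)
```

Without `-B/usr/bin`, GCC no longer searches /usr/bin first for its subprograms, so it is expected to fall back to the linker it would normally find, which in an EasyBuild environment should be the one from the toolchain's binutils.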

@boegel
Member

boegel commented Mar 9, 2019

@FokkeDijkstra How does this problem manifest itself, can you show an example of an error?

@boegel boegel added the bug fix label Mar 9, 2019
@boegel boegel added this to the 3.x milestone Mar 9, 2019
@boegel boegel requested a review from akesandgren March 9, 2019 21:04
@boegel boegel changed the title Extra patch to remove -B/usr/bin from linker_bin_path_flag in cuda_configure.bzl extra patch for TensorFlow 1.12.0 to remove -B/usr/bin from linker_bin_path_flag in cuda_configure.bzl Mar 9, 2019
akesandgren previously approved these changes Mar 9, 2019
Contributor

@akesandgren akesandgren left a comment

LGTM, makes sense to do this.

@boegel
Member

boegel commented Mar 9, 2019

Patch should also be used in TensorFlow-1.12.0-fosscuda-2018b-Python-2.7.15.eb?

@akesandgren
Contributor

Yeah probably. I see no reason why not.

@FokkeDijkstra
Contributor Author

> @FokkeDijkstra How does this problem manifest itself, can you show an example of an error?

I get errors like this when building TensorFlow on our CentOS 7.5 system:

  external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -o bazel-out/k8-opt/bin/external/protobuf_archive/js_embed -Wl,-no-as-needed -pie -Wl,-z,relro,-z,now '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -no-canonical-prefixes -B/usr/bin -Wl,--gc-sections -Wl,@bazel-out/k8-opt/bin/external/protobuf_archive/js_embed-2.params)
/usr/bin/ld.gold: error: /software/software/GCCcore/7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0/crtbeginS.o: unsupported reloc 42 against global symbol _ITM_deregisterTMCloneTable
/usr/bin/ld.gold: error: /software/software/GCCcore/7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0/crtbeginS.o: unsupported reloc 42 against global symbol _ITM_registerTMCloneTable
/usr/bin/ld.gold: error: bazel-out/k8-opt/bin/external/protobuf_archive/_objs/js_embed/embed.o: unsupported reloc 42 against global symbol std::ios_base::Init::~Init()
/software/software/GCCcore/7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0/crtbeginS.o(.text+0x1a): error: unsupported reloc 42
/software/software/GCCcore/7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0/crtbeginS.o(.text+0x6b): error: unsupported reloc 42
bazel-out/k8-opt/bin/external/protobuf_archive/_objs/js_embed/embed.o:embed.cc:function _GLOBAL__sub_I_main: error: unsupported reloc 42
collect2: error: ld returned 1 exit status
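
One way to check which linker the compiler driver picks up is GCC's `-print-prog-name` option, run with and without the `-B/usr/bin` prefix. The following is a minimal sketch under stated assumptions: the helper names `probe_command`, `linker_for`, and `main` are hypothetical, and `gcc` is assumed to be the toolchain compiler on the PATH.

```python
import subprocess


def probe_command(compiler="gcc", prefix=None):
    """Build the command that asks the compiler driver which ld it would run."""
    cmd = [compiler]
    if prefix is not None:
        # -B adds a directory to the driver's search path for subprograms.
        cmd.append("-B" + prefix)
    cmd.append("-print-prog-name=ld")
    return cmd


def linker_for(compiler="gcc", prefix=None):
    """Run the probe and return the linker path reported by the driver."""
    result = subprocess.run(
        probe_command(compiler, prefix),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def main():
    # Compare the driver's default choice with the choice under -B/usr/bin.
    print("default linker: ", linker_for())
    print("with -B/usr/bin:", linker_for(prefix="/usr/bin"))
```

Calling `main()` on the build host with the toolchain loaded should show whether `-B/usr/bin` redirects the driver to the system linker instead of the toolchain's own.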

@FokkeDijkstra
Contributor Author

> Patch should also be used in TensorFlow-1.12.0-fosscuda-2018b-Python-2.7.15.eb?

I'll add it to this PR.

Contributor

@akesandgren akesandgren left a comment

LGTM

@akesandgren
Contributor

Test report by @akesandgren
SUCCESS
Build succeeded for 6 out of 6 (2 easyconfigs in this PR)
b-an03.hpc2n.umu.se - Linux ubuntu 16.04, Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, Python 2.7.12
See https://gist.github.com/5109341fcf1dcd3119c05201948060c6 for a full test report.

@akesandgren
Contributor

Going in, thanks @FokkeDijkstra!

@akesandgren akesandgren merged commit 0947158 into easybuilders:develop Apr 2, 2019
@boegel boegel modified the milestones: 3.x, 3.9.0 Apr 2, 2019
@boegel
Member

boegel commented Apr 3, 2019

Test report by @boegel
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
nic170 - Linux centos linux 7.5.1804, Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz, Python 2.7.5
See https://gist.github.com/517bac185aa9cb60c6ef0ab2c36d5531 for a full test report.
