Clang 3.4 easyconfigs#653

Merged
boegel merged 12 commits into easybuilders:develop from wpoely86:clang34
Apr 3, 2014

Conversation

@wpoely86
Member

wpoely86 commented Jan 9, 2014

@hpcugentbot

Automatic reply from Jenkins: Can I test this?

@wpoely86
Member Author

Depends on easybuilders/easybuild-framework#812 to build

@wpoely86
Member Author

Jenkins: ok to test

@wpoely86
Member Author

wpoely86 commented Feb 3, 2014

Now depends on easybuilders/easybuild-easyblocks#351

@boegel
Member

boegel commented Feb 13, 2014

@wpoely86: the build of Clang 3.4 is hanging on my end, it's stuck in stage 1:

vsc40023  91112  0.0  0.0  22192  1424 pts/1    S+   20:47   0:00      |                           |   \_ /bin/bash /tmp/vsc40023/easybuild_build/Clang/3.4/GCC-4.8.2/llvm.obj.1/test/Object/Output/directory.ll.script
vsc40023 106408 93.1 25.5 33577416 16846136 pts/1 R+ 21:35  93:13      |                           |       \_ /tmp/vsc40023/easybuild_build/Clang/3.4/GCC-4.8.2/llvm.obj.1/bin/./llvm-ar r /tmp/vsc40023/easybuild_build/Clang/3.4/GCC-4.8.2/llvm.obj.1/test/Object/Output/test.a /tmp/vsc40023/easybuild_build/Clang/3.4/GCC-4.8.2/llvm.obj.1/test/Object/Output/a-very-long-file-name

@wpoely86
Member Author

@boegel clang 3.4 builds fine here. Stage 1 takes about 10 minutes with -j8.

@boegel
Member

boegel commented Feb 14, 2014

@wpoely86: where's 'here'? it was hanging in the stage1 tests iirc, and I think we've been through this before :)

@wpoely86
Member Author

@boegel I've redone the build on one of my machines and it works flawlessly (build takes about 1.5h with -j4).

You seem to have this habit of breaking my easyconfigs 😉

@wpoely86
Member Author

@gribozavr Can you test this build? There seems to be an issue with the test suite after stage 1 on our HPC system (works fine on my desktop PC).

@fgeorgatos
Contributor

hi folks, how about trying EASYBUILD_BUILDPATH=/dev/shm, to rule out annoying fs related issues?
(this does not prevent issues at install step, but by that time you know what to suspect first...)
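
A minimal sketch of that suggestion in Python: it only prepares the command and environment for a build whose build path points at /dev/shm, and runs nothing. `EASYBUILD_BUILDPATH` comes from the comment above; the helper name `eb_invocation` and the easyconfig file name are hypothetical.

```python
import os


def eb_invocation(easyconfig, buildpath="/dev/shm", base_env=None):
    """Return (command, environment) for an EasyBuild run with a custom build path.

    Nothing is executed here; the result can be handed to
    subprocess.run(cmd, env=env) once it looks right.
    """
    env = dict(os.environ if base_env is None else base_env)
    # EASYBUILD_BUILDPATH is the variable suggested in the thread.
    env["EASYBUILD_BUILDPATH"] = buildpath
    return ["eb", easyconfig], env


# Hypothetical easyconfig file name, for illustration only.
cmd, env = eb_invocation("Clang-3.4-GCC-4.8.2.eb", base_env={"PATH": "/usr/bin"})
```

Keeping the environment separate from the command makes it easy to repeat the same build against a different build path and compare the outcome.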

@wpoely86
Member Author

@fgeorgatos I'm already testing several possibilities to rule out fs. On my plain old ext4, it builds fine.

@fgeorgatos
Contributor

@wpoely86: a decade ago, a very obscure bug hit a grid cluster I was on, which was traced down to GPFS/2.2 not supporting a given length of symlinks (that was not POSIX compliant!). Lesson is, bugs like that can creep anywhere, and I noticed somewhere in this PR a ref. to a_very_long_filename. So, let's assume similar situations may pop up anywhere and play safe...

@boegel
Member

boegel commented Feb 15, 2014

In my testing, the build directory is also ext4, only the install dir is GPFS (but it never gets that far).

@wpoely86
Member Author

OK, so far I've been able to build stage 1 with -j1 and run the tests, but a lot of tests fail, mostly those related to the various sanitizers. I'm beginning to think that their test suite does not like our build environment. I wonder whether building it plainly on a node (not using PBS or cgroups) would give different results. Several tests fail due to limits on the vmem (ulimit -v).

  Expected Passes    : 13136
  Expected Failures  : 69
  Unsupported Tests  : 3291
  Unexpected Failures: 362

I'm going to try to build and ignore the tests and see what I get.

@gribozavr
Contributor

@wpoely86 Unfortunately, I cannot test the build, but it is quite possible that the tests are failing because of the limits set. ASan and TSan require terabytes of virtual memory -- just to map it, not to actually allocate it.
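
To check whether a session is affected, the limit that `ulimit -v` controls can be read from Python as `RLIMIT_AS`. The sketch below is an illustration only: the function names are hypothetical, and the 1 TiB threshold is a placeholder standing in for "terabytes", not a figure taken from the sanitizer documentation.

```python
import resource

TERABYTE = 1 << 40  # placeholder threshold, an assumption for illustration


def address_space_limit_bytes():
    """Return the soft address-space limit in bytes, or None when unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_AS)
    return None if soft == resource.RLIM_INFINITY else soft


def sanitizer_tests_likely_to_fit(limit_bytes, needed_bytes=TERABYTE):
    """Guess whether a process may map `needed_bytes` of virtual memory."""
    return limit_bytes is None or limit_bytes >= needed_bytes


limit = address_space_limit_bytes()
if sanitizer_tests_likely_to_fit(limit):
    print("virtual memory limit looks large enough for the sanitizer tests")
else:
    print("virtual memory is capped at %d bytes; sanitizer tests may fail" % limit)
```

Note that the shell's `ulimit -v` reports kilobytes, while `resource.getrlimit` reports bytes.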

@wpoely86
Member Author

OK, I've done a complete 3 stage build without running the tests and the clang binary seems to work just fine. Maybe we just have to disable all the sanitizer tests? The linkage of the final binary is strange:

    linux-vdso.so.1 =>  (0x00007fff436a9000)
    /$LIB/snoopy.so => /lib64/snoopy.so (0x00002b660af60000)
    librt.so.1 => /lib64/librt.so.1 (0x00002b660b161000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00002b660b36a000)
    libtinfo.so.5 => /lib64/libtinfo.so.5 (0x00002b660b56e000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b660b78f000)
    libz.so.1 => /apps/gent/SL6/sandybridge/software/zlib/1.2.8-ictce-5.5.0/lib/libz.so.1 (0x00002b660b9ad000)
    libstdc++.so.6 => /user/data/gent/gvo000/gvo00003/apps/delcatty/software/GCC/4.8.2/lib64/libstdc++.so.6 (0x00002b660bbc7000)
    libm.so.6 => /lib64/libm.so.6 (0x00002b660bed0000)
    libgcc_s.so.1 => /user/data/gent/gvo000/gvo00003/apps/delcatty/software/GCC/4.8.2/lib64/libgcc_s.so.1 (0x00002b660c155000)
    libc.so.6 => /lib64/libc.so.6 (0x00002b660c36b000)
    /lib64/ld-linux-x86-64.so.2 (0x00002b660ad3e000)
    libimf.so => /apps/gent/SL6/sandybridge/software/ifort/2013.5.192/compiler/lib/intel64/libimf.so (0x00002b660c700000)
    libsvml.so => /apps/gent/SL6/sandybridge/software/ifort/2013.5.192/compiler/lib/intel64/libsvml.so (0x00002b660cbbc000)
    libirng.so => /apps/gent/SL6/sandybridge/software/ifort/2013.5.192/compiler/lib/intel64/libirng.so (0x00002b660d586000)
    libintlc.so.5 => /apps/gent/SL6/sandybridge/software/ifort/2013.5.192/compiler/lib/intel64/libintlc.so.5 (0x00002b660d78e000)

Those last libraries of ifort should not be there. Those modules are not loaded and we use the GCC toolchain.

@boegel Any chance you can run the build with ulimit -v unlimited?
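
The linkage check described in this comment can be sketched as a small filter over `ldd` output. This is an illustration, not part of the PR: `unexpected_libraries` is a hypothetical helper, and matching on the substring `ifort` is an assumption based on the paths shown above.

```python
def unexpected_libraries(ldd_output, forbidden_fragments=("ifort",)):
    """Return (library, path) pairs whose resolved path contains a forbidden fragment."""
    hits = []
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # e.g. the dynamic loader line has no "=>" part
        name, _, target = line.partition("=>")
        # Drop the load address in parentheses, keep only the resolved path.
        path = target.split("(")[0].strip()
        if any(fragment in path for fragment in forbidden_fragments):
            hits.append((name.strip(), path))
    return hits


# Three lines copied from the ldd output quoted above.
SAMPLE = """\
    linux-vdso.so.1 =>  (0x00007fff436a9000)
    libc.so.6 => /lib64/libc.so.6 (0x00002b660c36b000)
    libimf.so => /apps/gent/SL6/sandybridge/software/ifort/2013.5.192/compiler/lib/intel64/libimf.so (0x00002b660c700000)
"""

for name, path in unexpected_libraries(SAMPLE):
    print(name, "->", path)
```

Running such a filter after each rebuild would show quickly whether the Intel libraries have crept back in.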

@wpoely86
Member Author

Status update: I've managed to run with ulimit -v unlimited on our cluster, and all the sanitizer tests succeed then, but some others fail:

Failing Tests (11):
    Clang :: Misc/dev-fd-fs.c
    Clang :: Modules/prune.m
    LLVM :: Object/ar-create.test
    LLVM :: Object/archive-extract-dir.test
    LLVM :: Object/archive-format.test
    LLVM :: Object/archive-replace-pos.test
    LLVM :: Object/archive-symtab.test
    LLVM :: Object/archive-update.test
    LLVM :: Object/directory.ll
    LLVM :: Object/extract.ll
    LLVM :: Object/nm-archive.test

  Expected Passes    : 13703
  Expected Failures  : 69
  Unsupported Tests  : 3291
  Unexpected Failures: 11

The full log is here: http://pastebin.com/WwdsbPr6
@gribozavr Can you shed light on how serious this is? The first two seem minor to me, but the rest is unclear. They all seem to fail on the same assert in llvm-ar.

The results I show are from the stage 3 build.

@gribozavr
Contributor

The first two failures are "harmless", but the rest of it is alarming.

@boegel
Member

boegel commented Feb 22, 2014

@wpoely86: The linkage to ifort libraries is very strange... Someone ran into something similar during the EasyBuild hackathon at JSC last week, so this is worth looking into first.

@ocaisa: You saw problems with Intel stuff being linked in even though no Intel toolchain was involved, right?

@wpoely86
Member Author

@boegel The linkage to ifort has magically disappeared after a rebuild. Now, I get:

    linux-vdso.so.1 =>  (0x00007fffa33ff000)
    /$LIB/snoopy.so => /lib64/snoopy.so (0x00002ae891cab000)
    librt.so.1 => /lib64/librt.so.1 (0x00002ae891eac000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00002ae8920b4000)
    libtinfo.so.5 => /lib64/libtinfo.so.5 (0x00002ae8922b8000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00002ae8924da000)
    libz.so.1 => /lib64/libz.so.1 (0x00002ae8926f7000)
    libstdc++.so.6 => /user/data/gent/gvo000/gvo00003/apps/delcatty/software/GCC/4.8.2/lib64/libstdc++.so.6 (0x00002ae89290d000)
    libm.so.6 => /lib64/libm.so.6 (0x00002ae892c17000)
    libgcc_s.so.1 => /user/data/gent/gvo000/gvo00003/apps/delcatty/software/GCC/4.8.2/lib64/libgcc_s.so.1 (0x00002ae892e9b000)
    libc.so.6 => /lib64/libc.so.6 (0x00002ae8930b1000)
    /lib64/ld-linux-x86-64.so.2 (0x00002ae891a89000)

The linking to libtinfo.so.5 (part of ncurses) does not happen on my machine.

Anyway, I'm more intrigued by the failing asserts in llvm-ar for the moment. I'm gonna try a manual build of clang and see what happens...

@wpoely86
Member Author

Hurrah! The culprit has been found: very long UIDs. It's already fixed upstream, so I backported the part of the patch that we need.

All the sanitizer tests pass when ulimit -v is unlimited, so I'm gonna make the easyblock handle that.
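
For context on the long-UID problem: the common ar archive header stores the owner ID as decimal text in a 6-character field, so a UID of seven or more digits does not fit. The sketch below assumes the llvm-ar assert is related to that field width, which the thread does not spell out; the helper name and the example UIDs are hypothetical.

```python
import os

# Width of the owner-ID field in the common ar member header (decimal text).
AR_UID_FIELD_WIDTH = 6


def uid_fits_ar_header(uid, width=AR_UID_FIELD_WIDTH):
    """Return True if `uid`, written in decimal, fits the ar owner-ID field."""
    return len(str(uid)) <= width


# Hypothetical UIDs: a typical desktop account and a long cluster account ID.
for uid in (1000, 1234567, os.getuid()):
    status = "fits" if uid_fits_ar_header(uid) else "does not fit"
    print(uid, status)
```

A check like this would explain why the same build can pass on a desktop machine and fail on a cluster where account IDs are much larger.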

@boegel
Member

boegel commented Apr 3, 2014

tested this on top of easybuilders/easybuild-easyblocks#366, works like a charm

thanks @wpoely86!

boegel added a commit that referenced this pull request Apr 3, 2014
boegel merged commit 1e94163 into easybuilders:develop Apr 3, 2014
wpoely86 deleted the clang34 branch April 3, 2014 13:45