Skip to content

Make TensorFlow use system/EasyBuild installed libraries#2117

Merged
boegel merged 6 commits intoeasybuilders:developfrom
Flamefire:tensorflow_syslibs
Sep 4, 2020
Merged

Make TensorFlow use system/EasyBuild installed libraries#2117
boegel merged 6 commits intoeasybuilders:developfrom
Flamefire:tensorflow_syslibs

Conversation

@Flamefire
Copy link
Copy Markdown
Contributor

@Flamefire Flamefire commented Aug 13, 2020

TensorFlow allows using already installed libs. This avoids problems with the bundled TF libs (e.g. in 2.3.0) and speeds up the build process.

closes #1996

edit: now requires easybuilders/easybuild-framework#3426

@boegel
Copy link
Copy Markdown
Member

boegel commented Aug 13, 2020

Tested with:

  • TensorFlow-1.13.1-foss-2019a-Python-3.7.2.eb
  • TensorFlow-1.15.2-fosscuda-2019b-Python-3.7.4.eb
  • TensorFlow-2.1.0-foss-2019b-Python-3.7.4.eb
  • TensorFlow-2.2.0-foss-2019b-Python-3.7.4.eb
  • TensorFlow-2.2.0-fosscuda-2019b-Python-3.7.4.eb

@boegel boegel added this to the next release (4.2.3?) milestone Aug 13, 2020
@Flamefire
Copy link
Copy Markdown
Contributor Author

Flamefire commented Aug 13, 2020

Thanks for the testing. This change should currently have no effect as the dependencies are missing. I added them to the 2.1.0 ECs in easybuilders/easybuild-easyconfigs#11109 so you can add those to your tests too if you like.

Edit: While doing this I noticed that there is a compiler wrapper for intel which sets some env vars. The better approach would be to use --action_env which trims down the ~100 lines to about 4. I'm also suspicious about 'intel_license_file': os.getenv('INTEL_LICENSE_FILE', os.getenv('LM_LICENSE_FILE')),. Is it really ok to set INTEL_LICENSE_FILE=$LM_LICENSE_FILE (and removing $LM_LICENSE_FILE) basically renaming the var? That could also be fixed by that change

@Flamefire
Copy link
Copy Markdown
Contributor Author

Also another idea: As TF likes to add/remove/rename packages, maybe we should add all names to the EasyBlock and verify the list. Even when we don't have a corresponding EC yet.

Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
@Flamefire
Copy link
Copy Markdown
Contributor Author

I addressed your comments and also some changes:

  • introduce function get_system_libs_for_version which does the version filtering and can be used to render a list that can be diffed against the TF included list on updates. IMO crucial to keep up to date. We might even want to do this automatically during the build, what do you think?
  • Add ALL TF known libs to a) allow for the above and b) see which are missing to potentially add new ECs or find the correct ones
  • Combined the 2 lists into 1 and use values of None and True to denote missing ECs and "assume installed"
  • Change order of tf_name and eb_name and sort lines. There are benefits in doing it this way (e.g. check against the tf list) but maybe the other way round is better as it allows to see name changes more easily as EC names stay (usually) constant (see jpeg). I tend to revert the order again but I'm still not 100% happy with the list anyway. Not great to read. Any ideas?

@Flamefire Flamefire marked this pull request as draft August 14, 2020 15:00
@Flamefire
Copy link
Copy Markdown
Contributor Author

Ok, still need to do a few adjustments. I'm adding and testing more ECs to be used as dependencies. To check I added a log which dependencies that could be used by TF are not.

The main dependency I'd like to include is protobuf but that needs another patch and a test. Checking now if that works...

@Flamefire Flamefire force-pushed the tensorflow_syslibs branch 2 times, most recently from f5a604f to 5acb11e Compare August 17, 2020 13:47
@Flamefire Flamefire marked this pull request as ready for review August 17, 2020 14:07
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py
@Flamefire
Copy link
Copy Markdown
Contributor Author

@boegel needed to rebase to include the ninja fix to test this. New commits are starting at "Address review"

Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py Outdated
Comment thread easybuild/easyblocks/t/tensorflow.py
Comment thread easybuild/easyblocks/t/tensorflow.py
TensorFlow allows using already installed libs. This avoids problems with the bundled TF libs (e.g. in 2.3.0) and speeds up the build process.
Split definitions of $TF_SYSTEM_LIBS into multiple structures
Change the structure of those
Check for availability of Python packages
Make some functions module level instead of class level
Check against TF sources and print warning on outdated EasyBlock definitions
That version is missing spaces around the assignment operator
@boegel
Copy link
Copy Markdown
Member

boegel commented Sep 4, 2020

Doing a last round of testing, but this should be good to go now... Thanks a lot for your efforts on this @Flamefire!

@boegel
Copy link
Copy Markdown
Member

boegel commented Sep 4, 2020

This has been tested extensively, both with a wide range of existing TensorFlow easyconfigs and the updated ones for TensorFlow 2.1.0 from easybuilders/easybuild-easyconfigs#11109 that require these changes, so going in!

@boegel boegel merged commit 1d64409 into easybuilders:develop Sep 4, 2020
@Flamefire Flamefire deleted the tensorflow_syslibs branch September 4, 2020 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TensorFlow ignores system dependencies

2 participants