Add support for IntelMPI in TensorFlow.#1507
Conversation
We need a separate wrappet for mpiicc (and probably for mpicxx if using non-Intel compilers) because IntelMPI needs I_MPI_ROOT to be set. This patch only adds support for IntelMPI with Intel compilers. Also simplify PATH setting for the wrappers.
|
@akesandgren How does the problem that you're fixing here manifest itself when building TF with |
|
|
||
| # patch configure.py (called by configure script) to avoid that Bazel abuses $HOME/.cache/bazel | ||
| # Should perhaps set --install_base and --output_user_root too | ||
| # bazel still uses $HOME/.cache/bazel for some things. |
There was a problem hiding this comment.
@akesandgren open an issue for this, no point in keeping track of TODOs in comments (especially without mentioning TODO or FIXME ;-))
There was a problem hiding this comment.
@akesandgren Please remove these added comments and open an issue instead?
| env.setvar('PATH', os.pathsep.join([icc_wrapper_dir, os.getenv('PATH')])) | ||
|
|
||
| if use_mpi: | ||
| mpiicc_wrapper_dir = os.path.join(tmpdir, 'bin2') |
There was a problem hiding this comment.
@akesandgren Why create a separate wrapper dir?
I don't see the point of that, the icc and mpiicc wrappers can live in the same place?
Added benefit is that you only need to update $PATH once.
There was a problem hiding this comment.
The wrapper removes its own path from the environment so mpiicc won't be able to find the icc wrapper if they live in the same wrapper dir i think. Haven't properly tested that though.
But on the other hand it should inherit the env that mpiicc is using....
Need to test this...
There was a problem hiding this comment.
Hmm, right, I forgot about that part... Better make that clear with a comment, and use a better name than bin2 ;)
|
|
||
| icc_wrapper_txt = INTEL_COMPILER_WRAPPER % { | ||
| 'compiler_path': which('icc'), | ||
| 'intel_mpi_root': '', |
There was a problem hiding this comment.
@akesandgren I would use a more meaningful value here, like UNKNOWN or IRRELEVANT, in case someone inspects the generated wrapper?
|
|
||
| export CPATH='%(cpath)s' | ||
| export INTEL_LICENSE_FILE='%(intel_license_file)s' | ||
| export I_MPI_ROOT='%(intel_mpi_root)s' |
There was a problem hiding this comment.
maybe also include a comment line above:
# only relevant for MPI compiler wrapper (mpiicc), not for regular compiler (icc)
export I_MPI_ROOT='%(intel_mpi_root)s'| 'compiler_path': which('mpiicc'), | ||
| 'intel_mpi_root': os.getenv('I_MPI_ROOT'), | ||
| 'cpath': os.getenv('CPATH'), | ||
| 'intel_license_file': os.getenv('INTEL_LICENSE_FILE', os.getenv('LM_LICENSE_FILE')), |
There was a problem hiding this comment.
Since we use these in two places, we should grab them only once from the environment above:
wrapper_cpath = os.getenv('CPATH')
wrapper_intel_licfile = os.getenv('INTEL_LICENSE_FILE', os.getenv('LM_LICENSE_FILE'))| } | ||
| mpiicc_wrapper = os.path.join(mpiicc_wrapper_dir, 'mpiicc') | ||
| write_file(mpiicc_wrapper, mpiicc_wrapper_txt) | ||
| env.setvar('PATH', os.pathsep.join([mpiicc_wrapper_dir, os.getenv('PATH')])) |
There was a problem hiding this comment.
this can be removed if we let the wrapper share the same dir
| if use_mpi: | ||
| impi_root = get_software_root('impi') | ||
| if impi_root: | ||
| mpi_home = os.path.join(impi_root, 'intel64') |
There was a problem hiding this comment.
probably worth adding some debug logging here (only under use_mpi):
self.log.debug("Derived value for MPI_HOME: %s", mpi_home)|
WIll take a look at the comments tomorrow. |
|
Without the mpiicc wrapper and setting of I_MPI_ROOT I get this when linking: The reason for that is, that without I_MPI_ROOT defined mpiicc passes the wrong -L args down to icc. |
IntelMPI compiler wrappers need a TF/Bazel wrapper even for non-intel compiler based toolchains. Add wrappers for C++ compiler and MPI wrappers. We only need a single wrapper directory, if the main compiler is an MPI compiler the actual compiler will be called dierectly from inside the MPI compiler wrapper and thus the environment will remain correct.
|
Think about a helper func looking like this: Should that be called as: or some other way? (Still learning python :-) And should i make the configure_step tmpdir variable into self.tmpdir instead? Would still need wrapper_dir available outside of the function to handle the PATH extension, so perhaps have wrapper_dir as param instead of tmpdir? |
Eliminates repeated code and increases readability.
| use_mpi = self.toolchain.options.get('usempi', False) | ||
| mpi_home = '' | ||
| if use_mpi: | ||
| impi_root = get_software_root('impi') |
There was a problem hiding this comment.
@akesandgren Since you're using this in two places in configure_step, why not combine this with the exact same check above to put the mpiicc wrapper in place?
use_mpi = self.toolchain.options.get('usempi', False)
impi_root = get_software_root('impi')
mpi_home = ''
if use_mpi and impi_root:
# put wrappers for Intel MPI compiler wrappers in place
...
# set correct value for MPI_HOME
mpi_home = os.path.join(impi_root, 'intel64')|
Looks good, tested with several existing |
We need a separate wrappet for mpiicc (and probably for mpicxx if
using non-Intel compilers) because IntelMPI needs I_MPI_ROOT to be set.
This patch only adds support for IntelMPI with Intel compilers.
Also simplify PATH setting for the wrappers.