PPPL deep learning disruption prediction package
module load anaconda cudatoolkit/7.5 cudann openmpi/intel-16.0/1.8.8/64
source activate environment
python setup.py install
where environment is the name of an Anaconda environment containing the Python packages listed in requirements.txt.
python guarantee_preprocessed.py
Use the Slurm scheduler to perform batch or interactive analysis on the Tiger cluster.
For batch analysis, make sure to allocate 1 process per GPU:
#SBATCH -N X
#SBATCH --ntasks-per-node=4
#SBATCH --ntasks-per-socket=2
#SBATCH --gres=gpu:4
where X is the number of nodes for distributed data-parallel training.
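As a rough illustration of how these directives might sit in a complete batch script, the sketch below writes one to a hypothetical file named slurm_sketch.cmd, so the repository's own slurm.cmd is left untouched. The node count (2), the time limit (borrowed from the interactive example), and the assumption that the batch job runs the same mpirun command as the interactive workflow are illustrative choices, not taken from the actual slurm.cmd, which may differ.

```shell
# Write an illustrative batch script to a hypothetical file name.
# The quoted 'EOF' keeps the shell from expanding anything inside the heredoc.
cat > slurm_sketch.cmd <<'EOF'
#!/bin/bash
# Assumption: 2 nodes (X = 2), i.e. 8 GPUs in total.
#SBATCH -N 2
# One MPI process per GPU: 4 tasks and 4 GPUs per node.
#SBATCH --ntasks-per-node=4
#SBATCH --ntasks-per-socket=2
#SBATCH --gres=gpu:4
# Assumption: 6-hour limit, same as the interactive example.
#SBATCH -t 0-6:00

# Same environment setup as for installation.
module load anaconda cudatoolkit/7.5 cudann openmpi/intel-16.0/1.8.8/64
source activate environment

# Assumption: the job is submitted from the directory that contains plasma-python.
cd plasma-python
mpirun -npernode 4 python examples/mpi_learn.py
EOF
```

Such a file would be submitted the same way as the script shipped with the package.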
sbatch slurm.cmd
For interactive analysis, first request an interactive session:
salloc -N [X] --ntasks-per-node=16 --ntasks-per-socket=8 --gres=gpu:4 -t 0-6:00
where X is the number of nodes, so the total number of GPUs is X * 4.
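The GPU count can be checked with a line of shell arithmetic before requesting the allocation. This is only a sketch: the variable names are hypothetical, and the node count of 2 is an example value.

```shell
# Hypothetical variable names; NODES plays the role of X in salloc -N [X].
NODES=2
GPUS_PER_NODE=4   # matches --gres=gpu:4
TOTAL_GPUS=$((NODES * GPUS_PER_NODE))
echo "nodes=${NODES} total_gpus=${TOTAL_GPUS}"
```

With mpirun -npernode 4, this total is also the number of MPI processes that get launched, one per GPU.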
Then launch the application from the command line:
cd plasma-python
mpirun -npernode 4 python examples/mpi_learn.py
Note: Theano compiles the model during the first epoch, which distorts timing measurements. When testing, set num_epochs >= 2 in conf.py.