don't specify --ntasks-per-node when submitting Slurm jobs #2887

Merged
ocaisa merged 1 commit into easybuilders:develop from boegel:slurm_ntasks_per_node
May 18, 2019

@boegel
Member

boegel commented May 17, 2019

Currently, jobs for a Slurm backend are submitted with "sbatch --nodes 1 --ntasks 5 --ntasks-per-node 5" when "eb --job --job-cores 5" is used.

That works fine until you change your mind after the jobs have been submitted and want to increase the number of cores using "scontrol update job=<jobid> NumTasks=10 NumCPUS=10".

The --ntasks-per-node 5 that was used at submission time cannot be changed afterwards, so it prevents you from increasing the number of cores requested for that job (if you do try to increase the number of cores/tasks, the job gets stuck due to BadConstraints).

I see no need for using --ntasks-per-node at all, so let's just remove it...
Afaik, this shouldn't cause any problems.
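
A minimal sketch of what the change amounts to, assuming a hypothetical helper `build_sbatch_command` (not EasyBuild's actual code) and a placeholder job script name `job.sh`. It only assembles the command line quoted above, with and without --ntasks-per-node:

```python
def build_sbatch_command(cores, script="job.sh", pin_tasks_per_node=False):
    """Assemble an sbatch command line for a single-node job.

    Hypothetical helper for illustration only; the function name, the
    'script' default and the 'pin_tasks_per_node' switch are assumptions.
    """
    cmd = ["sbatch", "--nodes", "1", "--ntasks", str(cores)]
    if pin_tasks_per_node:
        # old behaviour: also fix the number of tasks per node at submission time
        cmd += ["--ntasks-per-node", str(cores)]
    cmd.append(script)
    return cmd


# old behaviour, as quoted for "eb --job --job-cores 5"
before = build_sbatch_command(5, pin_tasks_per_node=True)
# proposed behaviour: same request, without --ntasks-per-node
after = build_sbatch_command(5)

print(" ".join(before))
print(" ".join(after))
```

With --ntasks-per-node left out, the task count is no longer tied to a per-node value fixed at submission time, which is the constraint that got in the way of the scontrol update.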

boegel added the change label May 17, 2019
boegel added this to the 3.9.1 milestone May 17, 2019
boegel requested a review from akesandgren May 17, 2019 22:00
ocaisa merged commit 4da8a51 into easybuilders:develop May 18, 2019
boegel deleted the slurm_ntasks_per_node branch May 18, 2019 11:16
@akesandgren
Contributor

Yes, I agree that --ntasks-per-node should not be used. But neither should --nodes; you should only use -n and -c and let the batch system take care of the rest.

@ocaisa
Member

ocaisa commented May 18, 2019

Not really, you need all the cores on the same node; make still has a ways to go before it can manage multiple nodes at once!

@akesandgren
Contributor

Yes, but MPI tests don't. And it's up to the user to specify sane things. EB shouldn't get in my way...
Say, for instance, that I have to build something that needs to run MPI during testing, and the test requires 4 MPI tasks. If I only have single-core nodes, then I still need -n 4, and EB should NOT add -N 1 to that.
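
A rough sketch of the alternative discussed in this thread, assuming a hypothetical helper `slurm_resource_options` (not part of EasyBuild): pass only the task count (-n / --ntasks) and the cores per task (-c / --cpus-per-task), and add --nodes only when the caller asks for it explicitly.

```python
def slurm_resource_options(ntasks, cpus_per_task=1, nodes=None):
    """Return sbatch resource options as a list of strings.

    Hypothetical helper for illustration only. The long option names are
    the sbatch equivalents of -n, -c and -N.
    """
    opts = ["--ntasks", str(ntasks), "--cpus-per-task", str(cpus_per_task)]
    if nodes is not None:
        # only constrain the node count when the user explicitly requests it
        opts += ["--nodes", str(nodes)]
    return opts


# MPI test needing 4 tasks: no node constraint, so single-core nodes can be used
print(slurm_resource_options(4))

# multi-core make build: one task with 5 cores
print(slurm_resource_options(1, cpus_per_task=5))

# explicit opt-in to the single-node constraint
print(slurm_resource_options(5, nodes=1))
```

In the second call, requesting the cores through --cpus-per-task for a single task is one way to keep them together without --nodes 1, on the assumption that the scheduler places all CPUs of one task on the same node.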
