
Increase test timeouts for PyTorch 1.8.1 and 1.9.0 #13700

Merged
boegel merged 1 commit into easybuilders:develop from Flamefire:20210811120129_new_pr_PyTorch181
Aug 12, 2021

@Flamefire
Contributor

@Flamefire Flamefire commented Aug 11, 2021

(created using eb --new-pr)

Increases the test timeouts for the distributed tests, since there were reports of those tests timing out with the previous "low" value. The increased timeout seems to work.

Note: Change is trivial enough to test with --fetch only

@boegel
Member

boegel commented Aug 11, 2021

@boegelbot please test @ generoso
CORE_CNT=16

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 107 out of 107 (4 easyconfigs in total)
taurusi8002 - Linux centos linux 7.9.2009, x86_64, AMD EPYC 7352 24-Core Processor (zen2), Python 2.7.5
See https://gist.github.com/1a889a18ba0f97688f866fed71de3159 for a full test report.

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on generoso

PR test command 'EB_PR=13700 EB_ARGS= /apps/slurm/default/bin/sbatch --job-name test_PR_13700 --ntasks="16" ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 18085

Test results coming soon (I hope)...

Details

- notification for comment with ID 896829156 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
taurusi8009 - Linux centos linux 7.9.2009, x86_64, AMD EPYC 7352 24-Core Processor (zen2), Python 2.7.5
See https://gist.github.com/ebb0e1dec2e7af260ef4c0838035fb2b for a full test report.

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
generoso-x-1 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/43f9a845389f7023b15c598bfd9be3df for a full test report.

@Flamefire
Contributor Author

@boegel full rebuild of 1 EC on one of the problematic nodes succeeded -> GTG

@boegel
Member

boegel commented Aug 12, 2021

@Flamefire The bot rebuilt all 4, so definitely OK to go, thanks!

@boegel
Member

boegel commented Aug 12, 2021

Going in, thanks @Flamefire!

@boegel boegel merged commit a2331cd into easybuilders:develop Aug 12, 2021
@Flamefire Flamefire deleted the 20210811120129_new_pr_PyTorch181 branch August 12, 2021 09:44
terjekv added a commit to terjekv/easybuild-easyconfigs that referenced this pull request Aug 12, 2021
@boegel boegel modified the milestones: 4.x, next release (4.4.2?) Aug 13, 2021
