Skip to content

remove log handlers that were added during tests, to ensure effective cleanup of log files#1282

Merged
boegel merged 6 commits intoeasybuilders:developfrom
boegel:trim_tests_tmp_disk_space
May 14, 2015
Merged

remove log handlers that were added during tests, to ensure effective cleanup of log files#1282
boegel merged 6 commits intoeasybuilders:developfrom
boegel:trim_tests_tmp_disk_space

Conversation

@boegel
Copy link
Copy Markdown
Member

@boegel boegel commented May 14, 2015

Although each test is properly cleaning up the tmp dir that was assigned to it in tearDown, the disk space used by the log files was not being freed because of open file handles on the log files.

I looked into properly cleaning up those file handles to trigger the disk space being freed, but this proved to be far from trivial (I concluded this after quite a couple of hours of digging around, accompanied with lots of swearing).

Alternative solution (thanks to @stdweird for the tip): quickly rotate the log files, so that only small bits (max 1MB per log) are left behind in the overall tmp directory used by the running test suite.

This effectively trims the required disk space in /tmp (or equivalent) down from ~3GB to less than 100MB.

@hpcugentbot
Copy link
Copy Markdown

Refer to this link for build results (access rights to CI server needed):
https://jenkins1.ugent.be/job/easybuild-framework-pr-builder/1703/
Test FAILed.

@boegel
Copy link
Copy Markdown
Member Author

boegel commented May 14, 2015

Not out of the woods yet:

Traceback (most recent call last):
  File "/var/lib/jenkins/.local/lib/python2.6/logging/handlers.py", line 77, in emit
    self.doRollover()
  File "/var/lib/jenkins/.local/lib/python2.6/logging/handlers.py", line 137, in doRollover
    self.stream = self._open()
  File "/var/lib/jenkins/.local/lib/python2.6/logging/__init__.py", line 846, in _open
    stream = open(self.baseFilename, self.mode)
IOError: [Errno 2] No such file or directory: '/var/tmp/eb-SFKPKF/eb-RIx5GT/eb-MZB8sc/eb-26IEHo/eb-cCFCzg/easybuild-toy-0.0-20150514.162546.AjYeE.log'
Traceback (most recent call last):
  File "/var/lib/jenkins/.local/lib/python2.6/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/var/lib/jenkins/.local/lib/python2.6/logging/handlers.py", line 150, in shouldRollover
    self.stream.seek(0, 2)  #due to non-posix-compliant Windows feature
ValueError: I/O operation on closed file

@boegel boegel changed the title limit disk space used by tests in tmp dir by quickly rotating log files limit disk space used by tests in tmp dir by quickly rotating log files (WIP) May 14, 2015
@boegel
Copy link
Copy Markdown
Member Author

boegel commented May 14, 2015

Hmm, problems mentioned above seemed to be caused by a known bug, see http://bugs.python.org/issue10949 (Jenkins is running Python 2.6).

@hpcugentbot
Copy link
Copy Markdown

Refer to this link for build results (access rights to CI server needed):
https://jenkins1.ugent.be/job/easybuild-framework-pr-builder/1704/
Test FAILed.

@boegel boegel changed the title limit disk space used by tests in tmp dir by quickly rotating log files (WIP) remove log handlers that were added during tests, to ensure effective cleanup of log files May 14, 2015
@boegel
Copy link
Copy Markdown
Member Author

boegel commented May 14, 2015

OK, found the actual culprit of the log files not being removed: the log handlers were not being cleaned up...

Simple fix, huge impact.

@boegel
Copy link
Copy Markdown
Member Author

boegel commented May 14, 2015

with this patch: Ran 284 tests in 313.337s

without this patch: Ran 284 tests in 907.145s

@wpoely86
Copy link
Copy Markdown
Member

looks fine

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can go because we restore the original env anyway, right?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, no need to treat this separately

@hpcugentbot
Copy link
Copy Markdown

Refer to this link for build results (access rights to CI server needed):
https://jenkins1.ugent.be/job/easybuild-framework-pr-builder/1705/
Test PASSed.

@pforai
Copy link
Copy Markdown
Contributor

pforai commented May 14, 2015

@boegel can you explain why this makes such a huge difference in performance?!

@boegel
Copy link
Copy Markdown
Member Author

boegel commented May 14, 2015

@pforai: Not removing the log handlers means that they stay in place, and also keep getting used. Every time main() is called from the tests, one or two handlers are added.
If these don't get removed, they will be used every time a log method is called (even though the log file they are supposed to write to is already gone, or at least the inode pointing to the data in the file is).

Near the end of the test suite, over 100 handlers were still active, all of which were still writing to disk (even though the inode of the log file was already removed via shutil.rmtree(self.test_prefix) in tearDown).

Before this patch, about 3GB of disk space was required in /tmp because the log files of previous tests were still being written to. Now, about 100MB is more than enough...

That's actually was caused us diving into this: /tmp on the Jenkins server is quite limited (~6GB free space), and two framework test suites running at the same was enough to fill up /tmp and let all things go to hell. Quite annoying (since all subsequent tests, including PR tests, failed hard then), and I was (finally) fed up with it, which motivated me to get to the bottom of this. ;-)

Some other changes were also done in this PR, i.e. not reconfiguring in tearDown, only to then reconfigure again in setUp of the next step. That helps too, but just a little bit.

@boegel
Copy link
Copy Markdown
Member Author

boegel commented May 14, 2015

Time to merge this in, and forget about it.

Thanks for the review @wpoely86, and @stdweird for helping to figure this out (a bit ;-)).

boegel added a commit that referenced this pull request May 14, 2015
remove log handlers that were added during tests, to ensure effective cleanup of log files
@boegel boegel merged commit c1afc30 into easybuilders:develop May 14, 2015
@boegel boegel deleted the trim_tests_tmp_disk_space branch May 14, 2015 18:30
@stdweird
Copy link
Copy Markdown
Contributor

@boegel nice!

boegel added a commit to boegel/easybuild-framework that referenced this pull request May 14, 2015
…space

remove log handlers that were added during tests, to ensure effective cleanup of log files
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants