Allocate memory more efficiently when packing objects#5170
Allocate memory more efficiently when packing objects#5170ethomson merged 1 commit intolibgit2:masterfrom
Conversation
The packbuilder code allocates memory in chunks. When it needs to allocate, it tries to add 1024 to the number of objects and multiply by 3/2. However, it actually multiplies by 1 instead, since it performs an integral division in the expression "3 / 2" and only then multiplies by the increased number of objects. The current behavior causes the code to waste massive amounts of time copying memory when it reallocates, causing inserting all non-blob objects in the Linux repository into a new pack to take some indeterminate time greater than 5 minutes instead of 52 seconds. Correct this error by first dividing by two, and only then multiplying by 3. We still check for overflow for the multiplication, which is the only part that can overflow. This appears to be the only place in the code base which has this problem.
|
Ouch. Nice catch. That's almost certainly my bad, from when we added the overflow bits, my rather mechanical transformations screwed this up. :/ |
Oh also, to clarify: with this PR it takes 52 seconds to repack? Or 52 seconds is the ideal git.git implementation time? |
With this PR, just inserting the non-blob objects (i.e., calling |
👍 Just wanted to make sure that I understood correctly. Thanks for the fix. |
The packbuilder code allocates memory in chunks. When it needs to allocate, it tries to add 1024 to the number of objects and multiply by 3/2. However, it actually multiplies by 1 instead, since it performs an integral division in the expression "3 / 2" and only then multiplies by the increased number of objects.
The current behavior causes the code to waste massive amounts of time copying memory when it reallocates, causing inserting all non-blob objects in the Linux repository into a new pack to take some indeterminate time greater than 5 minutes instead of 52 seconds. (I was impatient and didn't want to wait much longer than 5 minutes for my job to run.)
Correct this error by first dividing by two, and only then multiplying by 3. We still check for overflow for the multiplication, which is the only part that can overflow. This appears to be the only place in the code base which has this problem.
There are no tests for this because the only consequence of this change is a performance improvement: the old code would have been equally functional, just much slower.