6% faster, 6% less memory: more tightly pack CCoinMap::value_type#16970
Closed
martinus wants to merge 1 commit intobitcoin:masterfrom
Closed
6% faster, 6% less memory: more tightly pack CCoinMap::value_type#16970martinus wants to merge 1 commit intobitcoin:masterfrom
martinus wants to merge 1 commit intobitcoin:masterfrom
Conversation
This change reduces CCoinMap's value_type from 96 bytes to 80 bytes by more tightly packing it's data. This is achieved by these changes: * Refactored prevector so it uses a single byte to determine its size when in direct mode * Reduce CScriptBase from 28 to 27 indirect bytes * align CTxOut to 4 bytes to prevent padding when used as a member in Coin This tighter packing means more data can be stored in the coinsCache before it is full and has to be flushed to disk. In my benchmark, -reindex-chainstate was 6% faster and used 6% less memory. The cache could fit 14% more txo's before it had to resize.
TheBlueMatt
reviewed
Sep 26, 2019
| * CAmount, and 28 for CScript. As a member of Coin, this leaves 4 bytes for | ||
| * Coin's nHeight without the need for any padding. | ||
| */ | ||
| #pragma pack(push, 4) |
Contributor
There was a problem hiding this comment.
Does this break on oldschool 32-bit ARM where int reads are required to be aligned (and I presume this includes 64-bit ints?)
Contributor
Author
There was a problem hiding this comment.
I don't think it's a problem, we are using uint32_t everywhere and align to 4 byte, so it should be fine
Contributor
There was a problem hiding this comment.
Hmm? Isn't nValue an 8-bit integer? Or am I misunerstanding what this does?
Contributor
Author
There was a problem hiding this comment.
yes its 8byte, but I thought it would still be ok because of how it's used in Coin. I guess I've missed something, I really need to set up travis builds... I'll close this for now until I figure out if/how I can do this correctly
Member
unit tests fail https://travis-ci.org/bitcoin/bitcoin/jobs/590073788#L3601 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This change reduces
CCoinMap::value_typefrom 96 bytes to 80 bytes by more tightly packing it's data. This allows to cache a lot more transactions in the cache. I've achieved this with these changes:prevectorso it uses a single byte to determine its size when in direct modeCScriptBasefrom 28 to 27 indirect bytesCTxOutto 4 bytes to prevent padding when used as a member inCoinThis tighter packing means more data can be stored in the
coinsCachebefore it is full and has to be flushed to disk. In my benchmark,-reindex-chainstatewas 6% faster and used 6% less memory. The cache could fit 14% more txo's before it had to resize.Some numbers:
The graph shows nicely that the cache is able to be used a lot longer before it is full and flushed to the disk:

I've tried this change in combination with #16957 to not cache hash), and then additionally in combination with #16801 :
With all 3 PR's applied, reindex was 14% faster and used 20% less memory