
Don't empty dbcache on prune flushes: >30% faster IBD#28280

Merged
achow101 merged 14 commits into bitcoin:master from andrewtoth:sync-dirty
Aug 8, 2024

Conversation

@andrewtoth
Contributor

@andrewtoth andrewtoth commented Aug 16, 2023

Since #17487 we no longer need to clear the coins cache when syncing to disk. A warm coins cache significantly speeds up block connection, and only needs to be fully flushed when nearing the dbcache limit.

For frequent pruning flushes there's no need to empty the cache and kill connect block speed. However, simply using Sync in place of Flush actually slows down a pruned full IBD with a high dbcache value. This is because as the cache grows, sync takes longer since every coin in the cache is scanned to check if it's dirty. For frequent prune flushes and a large cache this constant scanning starts to really slow IBD down, and just emptying the cache on every prune becomes faster.

To fix this, we can add two pointers to each cache entry and construct a doubly linked list of dirty entries. We can then only iterate through all dirty entries on each Sync, and simply clear the pointers after.
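The mechanism can be sketched as follows. This is a simplified illustration, not the PR's actual `CCoinsCacheEntry` code; the names `CacheEntry`, `DirtyList`, and `CountFlagged` are hypothetical. Two intrusive pointers per entry link all flagged entries into a doubly linked list anchored at a sentinel head, so a sync can walk only the flagged entries instead of scanning the whole cache:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simplified cache entry: two intrusive pointers thread the
// flagged (DIRTY/FRESH) entries into a doubly linked list.
struct CacheEntry {
    uint8_t flags = 0;          // stand-in for the DIRTY/FRESH bits
    CacheEntry* prev = nullptr; // previous flagged entry (or the head)
    CacheEntry* next = nullptr; // next flagged entry (or the head)
};

struct DirtyList {
    CacheEntry head; // sentinel node: an empty list points at itself
    DirtyList() { head.prev = &head; head.next = &head; }

    // Called when an entry is first flagged: append it to the list.
    void Add(CacheEntry& e) {
        if (e.next) return; // already linked
        e.prev = head.prev;
        e.next = &head;
        head.prev->next = &e;
        head.prev = &e;
    }

    // Called when flags are cleared or the entry is destroyed: unlink it.
    void Remove(CacheEntry& e) {
        if (!e.next) return; // not linked
        e.prev->next = e.next;
        e.next->prev = e.prev;
        e.prev = e.next = nullptr;
    }
};

// Walk the list; cost is proportional to the number of flagged entries,
// not the size of the whole cache.
int CountFlagged(const DirtyList& list) {
    int n = 0;
    for (const CacheEntry* p = list.head.next; p != &list.head; p = p->next) ++n;
    return n;
}
```

The sentinel head means insertion and removal need no null checks on neighbours, and each entry can unlink itself in O(1) from its destructor.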

With this approach a full IBD with dbcache=16384 and prune=550 was 32% faster than master. For default dbcache=450 speedup was ~9%. All benchmarks were run with stopatheight=800000.

|        | prune | dbcache | time    | max RSS     | speedup |
|--------|-------|---------|---------|-------------|---------|
| master | 550   | 16384   | 8:52:57 | 2,417,464k  | -       |
| branch | 550   | 16384   | 6:01:00 | 16,216,736k | 32%     |
| branch | 550   | 450     | 8:05:08 | 2,818,072k  | 8.8%    |
| master | 10000 | 5000    | 8:19:59 | 2,962,752k  | -       |
| branch | 10000 | 5000    | 5:56:39 | 6,179,764k  | 28.8%   |
| master | 0     | 16384   | 4:51:53 | 14,726,408k | -       |
| branch | 0     | 16384   | 4:43:11 | 16,526,348k | 2.7%    |
| master | 0     | 450     | 7:08:07 | 3,005,892k  | -       |
| branch | 0     | 450     | 6:57:24 | 3,013,556k  | 2.6%    |

While the 2 pointers add memory to each cache entry, they did not slow down IBD; for non-pruned IBD, results were similar between this branch and master. When I performed the initial IBD, the full UTXO set could be held in memory at the max dbcache value. A non-pruned IBD to tip with max dbcache ended up using 12% more memory, but it was somehow also 2.7% faster. For smaller dbcache values the dbcache limit is respected, so no extra memory is consumed, and the potentially more frequent flushes were not significant enough to cause any slowdown.

For reviewers, the commits in order do the following:

- The first 4 commits encapsulate all accesses to flags on cache entries, and the 5th makes the flags private.
- The commits from `refactor: add CoinsCachePair alias` to `coins: call ClearFlags in CCoinsCacheEntry destructor` create the linked list head nodes and cache entry self references, and pass them into `AddFlags`.
- Commit `coins: track flagged cache entries in linked list` actually adds the entries into a linked list when they are flagged DIRTY or FRESH, and removes them from the linked list when they are destroyed or the flags are cleared manually. At this point the linked list is not yet used anywhere.
- Commit `test: add cache entry linked list tests` adds unit tests for the linked list.
- Commit `coins: pass linked list of flagged entries to BatchWrite` uses the linked list to iterate through DIRTY entries instead of scanning the entire coins cache.
- Commit `validation: don't erase coins cache on prune flushes` uses `Sync` instead of `Flush` for pruning flushes, so the cache is no longer cleared.
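The difference between the two flush modes in the last commit can be illustrated with a toy cache. This is a hedged sketch, not Bitcoin Core's actual `CCoinsViewCache` API; the name `ToyCache` and its members are hypothetical stand-ins for the real batch-write machinery:

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of the two flush modes: Sync() writes only dirty entries and
// keeps the cache warm; Flush() writes dirty entries and then empties the
// cache, so subsequent lookups go back to disk.
struct ToyCache {
    std::map<std::string, std::string> entries; // the in-memory cache
    std::map<std::string, bool> dirty;          // stand-in for the DIRTY flag
    std::map<std::string, std::string> disk;    // stand-in for the db batch

    void Write(const std::string& k, const std::string& v) {
        entries[k] = v;
        dirty[k] = true;
    }

    // Sync: batch-write only dirty entries, clear their flags, keep the cache.
    void Sync() {
        for (auto& [k, d] : dirty) {
            if (d) {
                disk[k] = entries[k];
                d = false;
            }
        }
    }

    // Flush: write out dirty entries, then drop the whole cache (cold after).
    void Flush() {
        Sync();
        entries.clear();
        dirty.clear();
    }
};
```

After a `Sync`, every cached coin is still available for fast block connection, which is the property the prune-flush change exploits.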

Inspired by this comment.

Fixes #11315.

@DrahtBot
Contributor

DrahtBot commented Aug 16, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

| Type        | Reviewers                            |
|-------------|--------------------------------------|
| ACK         | paplorinc, sipa, achow101, mzumsande |
| Concept ACK | jamesob, hernanmarino, vostrnad      |

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #30326 (optimization: Reduce cache lookups in CCoinsViewCache::FetchCoin by paplorinc)
  • #28216 (fuzz: a new target for the coins database by darosior)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@jamesob
Contributor

jamesob commented Aug 24, 2023

Concept ACK! I'll try to repro the bench results in the next week or so.

@0xB10C
Contributor

0xB10C commented Aug 30, 2023

@andrewtoth Not sure if you have seen #20827 which is a different approach but has a similar effect.

@andrewtoth
Contributor Author

andrewtoth commented Aug 30, 2023

@0xB10C indeed I have seen that one. I believe they complement each other. #20827 reduces IBD time for larger prune values and this PR reduces it even more for smaller prune values. I should benchmark this on top of that one.
Also, I'm curious how you are performing your benchmarks and generating all those nice graphs. Do you mind sharing what tools you use?

@0xB10C
Contributor

0xB10C commented Aug 31, 2023

Also, I'm curious how you are performing your benchmarks and generating all those nice graphs. Do you mind sharing what tools you use?

At the time I had a spare and otherwise idle machine set up I could use. I don't have the bash script at hand but IIRC I had a merge base (MB) and a pull-request (PR) binary I'd run with different prune and dbcache configurations. After each run, the script would rename and move the debug log (with extra -debug=coindb -debug=prune) to a separate folder. After all runs were done, I loaded the debug logs into the following Jupyter notebooks to parse the debug logs and to generate the graphs. For anyone revisiting this, these are the relevant comments #20827 (comment), #20827 (comment), and #20827 (comment) for my setup and results.

I've uploaded the Jupyter notebooks here: https://gist.github.com/0xB10C/89c2903290cfb1840792d41dcd079646. The first was used for a partial IBD from 500k-600k and the second one for a full IBD, both times syncing from a local node on the same host. The notebooks expect the debug log files to be named debug_{binary}_{cset}_{run}.log, where {binary} is either MB or PR, {cset} is an integer specifying the configuration set (pruning and dbcache parameters), and {run} is an integer numbering the run of this binary-cset combination if I did multiple. Feel free to remix and share in any way you want. Keep in mind that parsing logs is quite fragile, as log message formats can change. Feel free to reach out if there are questions.

@jamesob
Contributor

jamesob commented Sep 20, 2023

Bench running now, should have some results in the next day or so.

$ ( bitcoinperf --no-teardown bench-pr --num-blocks 30_000 --run-id no-flush-on-prune \
    --bitcoind-args='-dbcache=4000 -prune=550' --run-count 3 28280 && \
    pushover 'Bench for andrewtoth SUCCEEDED!' ) || \
  pushover 'Bench for andrewtoth failed'

@jamesob
Contributor

jamesob commented Sep 27, 2023

Local IBD of 30_000 blocks with -prune=550 from a Taproot-enabled height didn't show any difference. I didn't have -debug=prune enabled, so my bench debug.log was basically useless. Rerunning...

@sipa
Member

sipa commented Aug 7, 2024

reACK 589db87

@achow101
Member

achow101 commented Aug 7, 2024

ACK 589db87

Contributor

@mzumsande mzumsande left a comment


re-ACK 589db87

@andrewtoth
Contributor Author

🎉 Thanks everyone!

@hebasto
Member

hebasto commented Aug 10, 2024

Ported to the CMake-based build system in hebasto#318.

@l0rinc l0rinc mentioned this pull request Jan 14, 2026

Development

Successfully merging this pull request may close these issues.

Prune undermines the dbcache.