wallet: cache IsMine scriptPubKeys to improve performance of descriptor wallets#26008
Conversation
4aa65c3 to
23dffde
Compare
23dffde to
34590dd
Compare
|
Dropped the signing changes as they were causing issues with being able to sign transactions that we have the keys for but not the descriptors. #23417 would allow us to resolve the performance issues for signing while retaining this behavior. |
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
34590dd to
d4cdc2c
Compare
8945d10 to
020f5b8
Compare
src/wallet/wallet.cpp
Outdated
There was a problem hiding this comment.
| m_uncached_spkms.insert(spkm); | |
| if (!ranged) { | |
| for (const auto& script : spkm->GetScriptPubKeys()) { | |
| m_cached_spks[script].insert(spkm); | |
| m_uncached_spkms.erase(spkm); | |
| } | |
| } | |
| if (!ranged) { | |
| for (const auto& script : spkm->GetScriptPubKeys()) { | |
| m_cached_spks[script].insert(spkm); | |
| } | |
| } | |
| else { | |
| m_uncached_spkms.insert(spkm); | |
| } |
There was a problem hiding this comment.
I prefer the way it was written as it ensures that no spkm will be accidentally missed.
src/bench/wallet_balance.cpp
Outdated
There was a problem hiding this comment.
What would be the reason to remove this validation ?
There was a problem hiding this comment.
It is incorrect, unneeded, and caused a segfault.
It is incorrect because LoadWallet is supposed to be called before SetupDescriptorScriptPubKeyMans. Doing it after will break things that SetupDescriptorScriptPubKeyMans sets, which also caused a segfault. It is also entirely unneeded as loading an empty wallet doesn't do anything, and SetupDescriptorScriptPubKeyMans is doing all of the setup needed.
020f5b8 to
f9f467b
Compare
src/wallet/wallet.h
Outdated
There was a problem hiding this comment.
I think the names below work better and make the purpose of the variable clearer.
| std::unordered_map<CScript, std::unordered_set<ScriptPubKeyMan*>, SaltedSipHasher> m_cached_spks; | |
| //! Set of which descriptors are not in m_cached_spks | |
| std::unordered_set<ScriptPubKeyMan*> m_uncached_spkms; | |
| std::unordered_map<CScript, std::unordered_set<ScriptPubKeyMan*>, SaltedSipHasher> m_ranged_spks; | |
| //! Set of which descriptors are not in m_cached_spks | |
| std::unordered_set<ScriptPubKeyMan*> m_non_ranged_spkms; |
There was a problem hiding this comment.
I prefer describing the variable with how we expect to use it, rather than what it contains. We're using it as a cache of spkms, the type of spkm doesn't matter.
w0xlt
left a comment
There was a problem hiding this comment.
This change looks good, but adds complexity by adding two caches instead of accessing m_spk_managers directly.
Perhaps the PR description might include benchmark improvement as a reason for the added complexity.
f9f467b to
fd1b75e
Compare
fd1b75e to
9b248d1
Compare
|
reACK a966473 |
furszy
left a comment
There was a problem hiding this comment.
Left two more comments. They aren't really blocking but I'm still struggling with the memory consumption topic. Isn't std::unordered_map going to store all keys (entire scripts) in memory for duplicated so it can solve hash collisions?
Apart from that, just as an extra note for the future: It seems that we will need to rearrange code so that these functions aren't called so many times within the same process.
|
Tested Before: After: |
a966473 to
e7c141b
Compare
Yes. I've done some memory usage profiling with a large wallet, and various combinations of ordered and unordered maps and sets. The baseline memory usage of
The benchmark indicates that there isn't really a meaningful difference between all of these approaches. It seems that not using In the latest push, I've changed to using |
e7c141b to
b20d882
Compare
|
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the Possibly this is due to a silent merge conflict (the changes in this pull request being Leave a comment here, if you need help tracking down a confusing failure. |
|
I re-checked b20d882fd53c21098eb7858b2dbca84eafd2b344 and re-confirmed |
There was a problem hiding this comment.
Code review ACK b20d882fd53c21098eb7858b2dbca84eafd2b344. I don't have much insight into performance impact of this change, but it seems like it should make migrated legacy wallets much faster (10 times faster loading in one case: #26008 (comment)) even if it does increase memory usage, and it seems to be implemented correctly.
re: #26008 (review)
Apart from that, just as an extra note for the future: It seems that we will need to rearrange code so that these functions aren't called so many times within the same process.
I'm not sure what this is referring to, so would be curious
|
I've also added one more commit that speeds up loading and migration as I noticed one spot where |
|
ACK ac246f6 |
furszy
left a comment
There was a problem hiding this comment.
Reviewed. Need to update the second commit description.
After TopUp completes, the wallet containing each SPKM will want to know what new scriptPubKeys were generated. In order for all TopUp calls (including ones internal the the SPKM), we use a callback function in the WalletStorage interface.
Have CWallet maintain a cache of all known scriptPubKeys for its DescriptorSPKMs in order to improve performance of the functions that require searching for scriptPubKeys.
Instead of iterating m_spk_managers a DescriptorSPKM has been loaded in order to get it's ID to compare, have LoadDescriptorSPKM return a reference to the loaded DescriptorSPKM so it can be queried directly.
Done |
@ryanofsky. |
furszy
left a comment
There was a problem hiding this comment.
Code review ACK e041ed9
It isn't blocking, but I have to admit that I'm still not really happy with the doubled script storage. I think we can do better. Will be experimenting with possible solutions.
A first measure to decrease it and remain fast, without changing the structure, could be to use the cache only for the inactive SPKMs. Since the active ones are limited in number (max 8), they can be checked quickly. However, this approach does not address the issue of handling really large scripts, which could probably be resolved by changing the structure for a probabilistic one.
Wallets that have a ton of non-ranged descriptors (such as a migrated non-HD wallet) perform fairly poorly due to looping through all of the wallet's
ScriptPubKeyMans. This is done in various places, such asIsMine, and helper functions for fetching aScriptPubKeyManand aSolvingProvider. This also has a bit of a performance impact on standard descriptor wallets, although less noticeable due to the small number of SPKMs.As these functions are based on doing
IsMinefor eachScriptPubKeyMan, we can improve this performance by cachingIsMinescriptPubKeys for all descriptors and use that to determine whichScriptPubKeyManto actually use for those things. This cache is used exclusively and we no longer iterate the SPKMs.Also added a benchmark for
IsMine.