Reverse cs_main, cs_wallet lock order and reduce cs_main locking#16426
Reverse cs_main, cs_wallet lock order and reduce cs_main locking#16426maflcko merged 5 commits intobitcoin:masterfrom
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
Great work! I'd suggest changing the PR title to something like "Reverse cs_main, cs_wallet lock order and reduce cs_main locking" to motivate it better and describe the change. The current title "Remove Chain::Lock interface" and starting sentence "Follow-up in the set of Chain interface refactoring" make this sound mostly like a refactoring. But this is more of a locking change, and a change to make the wallet more asynchronous and event driven than a refactoring change. |
|
Concept ACK Excellent work! |
|
Concept ACK. This might help IBD, because |
|
Big concept ACK! Thanks @ariard |
|
Concept ACK, mother of god, not locking |
src/wallet/wallet.cpp
Outdated
There was a problem hiding this comment.
Instead of adding an interfaces::Chain::checkFinalTx method, it might be better to call IsFinalTx directly with the wallet's height and time, avoiding going through CheckFinalTx.
This could perform better since it wouldn't require locking cs_main, and could make wallet calls return more internally consistent information when the wallet is catching up to the chain.
There was a problem hiding this comment.
re: #16426 (comment)
Instead of adding an
interfaces::Chain::checkFinalTxmethod, it might be better to callIsFinalTx
#17443 is a step in this direction (#17443 (review))
a052c37 to
f057aed
Compare
|
Finally rebased after merge of #15931, PR should be ready for review. Apart of 3efe38d which use the new If you still feel PR is hard to review, I can subsplit easily in another PR. |
|
Approach ACK. Code changes here are very simple after #15931. All the changes before the last commit f057aedf80d8bd6083c2227e42d4887be4c2933b seem straightforward and don't change behavior other than locking Only the last commit f057aedf80d8bd6083c2227e42d4887be4c2933b is the big scary change that removes I'll review this PR more closely this week. It might be nice to simplify the PR description now that #15931 is merged. I think the description just basically needs to say that |
|
Restating my concept ACK. I plan to review this fully soon. Thanks for rebasing this so quickly |
f057aed to
8aea39a
Compare
Most of the code making assumptions about the tip is confined in the rescan logic, so I think this one should get the focus, you can grep for all methods fetching tip like |
7ef1044 to
0bdbc43
Compare
0bdbc43 to
f59c5c2
Compare
|
Concept ACK |
|
All good, I agree! |
7918c1b test: Add CreateWalletFromFile test (Russell Yanofsky) Pull request description: Add unit test calling CreateWalletFromFile, which isn't currently called from other unit tests, with some basic checks to make sure it rescans and registers for notifications correctly. Motivation for this change was to try to write a test that would fail without the early `handleNotifications` call in ef8c6ca from bitcoin#16426, but succeed with it: https://github.com/bitcoin/bitcoin/blob/ef8c6ca60767cac589d98ca57ee33179608ccda8/src/wallet/wallet.cpp#L3978-L3986 However, writing a full test for the race condition that call prevents isn't possible without the locking changes from bitcoin#16426. So this PR just adds as much test coverage as is possible now. This new test is also useful for bitcoin#15719, since it detects the stale notifications.transactionAddedToMempool notifications that PR eliminates. ACKs for top commit: MarcoFalke: ACK 7918c1b jonatack: ACK 7918c1b Tree-SHA512: 44035aee698ecb722c6039d061d8fac2011e9da0b314e4aff19be1d610b53cacff99016b34d6b84669bb3b61041b2318d9d8e3363658f087802ae4aa36ca17b8
@ryanofsky missed this, AFAICT this doesn't add new issues to the GUI. |
Upstream bitcoin/bitcoin#16426 reversed the lock order between the wallet and cs_main. In the name code, we had to do the same updates in some places.
Move wallet startup code closer to a simple model where the wallet attaches to the chain with a single chain.handleNotifications() call, and just starts passively receiving blocks and mempool notifications from the last update point, instead having to actively rescan blocks and request a mempool snapshot, and deal with the tip changing, and deal with stale and out of order notifications. Also, stop locking the cs_wallet mutex and registering for validationinterface notifications before the rescan. This is new behavior since 6a72f26 bitcoin#16426 and is not ideal because it stops other wallets and rpcs and the gui from receiving new notifications until after the scan completes. This change is a half-step towards implementing multiwallet parallel scans (bitcoin#11756), since it provides needed locator and birthday timestamp information to the Chain interface, and it rationalizes locking and event ordering in the startup code. The second half of implementing parallel rescans requires moving the ScanForWalletTransactions implementation (which this PR does not touch) from the wallet to the node.
Move wallet startup code closer to a simple model where the wallet attaches to the chain with a single chain.handleNotifications() call, and just starts passively receiving blocks and mempool notifications from the last update point, instead having to actively rescan blocks and request a mempool snapshot, and deal with the tip changing, and deal with early or stale notifications. Also, stop locking the cs_wallet mutex and registering for validationinterface notifications before the rescan. This was new behavior since 6a72f26 bitcoin#16426 and is not ideal because it stops other wallets and rpcs and the gui from receiving new notifications until after the scan completes. This change is a half-step towards implementing multiwallet parallel scans (bitcoin#11756), since it provides needed locator and birthday timestamp information to the Chain interface, and it rationalizes locking and event ordering in the startup code. The second half of implementing parallel rescans requires moving the ScanForWalletTransactions implementation (which this PR does not touch) from the wallet to the node.
Move wallet startup code closer to a simple model where the wallet attaches to the chain with a single chain.handleNotifications() call, and just starts passively receiving blocks and mempool notifications from the last update point, instead having to actively rescan blocks and request a mempool snapshot, and deal with the tip changing, and deal with stale and out of order notifications. Also, stop locking the cs_wallet mutex and registering for validationinterface notifications before the rescan. This is new behavior since 6a72f26 bitcoin#16426 and is not ideal because it stops other wallets and rpcs and the gui from receiving new notifications until after the scan completes. This change is a half-step towards implementing multiwallet parallel scans (bitcoin#11756), since it provides needed locator and birthday timestamp information to the Chain interface, and it rationalizes locking and event ordering in the startup code. The second half of implementing parallel rescans requires moving the ScanForWalletTransactions implementation (which this PR does not touch) from the wallet to the node.
Move wallet startup code closer to a simple model where the wallet attaches to the chain with a single chain.handleNotifications() call, and just starts passively receiving blocks and mempool notifications from the last update point, instead having to actively rescan blocks and request a mempool snapshot, and deal with the tip changing, and deal with early or stale notifications. Also, stop locking the cs_wallet mutex and registering for validationinterface notifications before the rescan. This was new behavior since 6a72f26 bitcoin#16426 and is not ideal because it stops other wallets and rpcs and the gui from receiving new notifications until after the scan completes. This change is a half-step towards implementing multiwallet parallel scans (bitcoin#11756), since it provides needed locator and birthday timestamp information to the Chain interface, and it rationalizes locking and event ordering in the startup code. The second half of implementing parallel rescans requires moving the ScanForWalletTransactions implementation (which this PR does not touch) from the wallet to the node.
Move wallet startup code closer to a simple model where the wallet attaches to the chain with a single chain.handleNotifications() call, and just starts passively receiving blocks and mempool notifications from the last update point, instead having to actively rescan blocks and request a mempool snapshot, and deal with the tip changing, and deal with early or stale notifications. Also, stop locking the cs_wallet mutex and registering for validationinterface notifications before the rescan. This was new behavior since 6a72f26 bitcoin#16426 and is not ideal because it stops other wallets and rpcs and the gui from receiving new notifications until after the scan completes. This change is a half-step towards implementing multiwallet parallel scans (bitcoin#11756), since it provides needed locator and birthday timestamp information to the Chain interface, and it rationalizes locking and event ordering in the startup code. The second half of implementing parallel rescans requires moving the ScanForWalletTransactions implementation (which this PR does not touch) from the wallet to the node.
This just drops two interfaces::Chain methods replacing them with other calls, not changing behavior in any way. Motivation was - Need to get rid of findFirstBlockWithTimeAndHeight for bitcoin#10102 - Followup from bitcoin#16426 (comment) - phantomcircuit comments about findNextBlock test http://www.erisian.com.au/bitcoin-core-dev/log-2020-06-06.html#l-214
This just drops two interfaces::Chain methods replacing them with other calls. Motivation for removing these chain methods: - Need to get rid of findFirstBlockWithTimeAndHeight for bitcoin#10102, which doesn't support overloaded methods - Followup from bitcoin#16426 (comment) - phantomcircuit comments about findNextBlock test http://www.erisian.com.au/bitcoin-core-dev/log-2020-06-06.html#l-214 Behavior is not changing in any way here. A TODO comment in ScanForWalletTransactions was removed, but just because it was invalid (see bitcoin#19195 (comment)), not because it was implemented.
Move wallet startup code closer to a simple model where the wallet attaches to the chain with a single chain.handleNotifications() call, and just starts passively receiving blocks and mempool notifications from the last update point, instead having to actively rescan blocks and request a mempool snapshot, and deal with the tip changing, and deal with early or stale notifications. Also, stop locking the cs_wallet mutex and registering for validationinterface notifications before the rescan. This was new behavior since 6a72f26 bitcoin#16426 and is not ideal because it stops other wallets and rpcs and the gui from receiving new notifications until after the scan completes. This change is a half-step towards implementing multiwallet parallel scans (bitcoin#11756), since it provides needed locator and birthday timestamp information to the Chain interface, and it rationalizes locking and event ordering in the startup code. The second half of implementing parallel rescans requires moving the ScanForWalletTransactions implementation (which this PR does not touch) from the wallet to the node.
This just drops two interfaces::Chain methods replacing them with other calls. Motivation for removing these chain methods: - Need to get rid of findFirstBlockWithTimeAndHeight for bitcoin#10102, which doesn't support overloaded methods - Followup from bitcoin#16426 (comment) - phantomcircuit comments about findNextBlock test http://www.erisian.com.au/bitcoin-core-dev/log-2020-06-06.html#l-214 Behavior is not changing in any way here. A TODO comment in ScanForWalletTransactions was removed, but just because it was invalid (see bitcoin#19195 (comment)), not because it was implemented.
Move wallet startup code closer to a simple model where the wallet attaches to the chain with a single chain.handleNotifications() call, and just starts passively receiving blocks and mempool notifications from the last update point, instead having to actively rescan blocks and request a mempool snapshot, and deal with the tip changing, and deal with early or stale notifications. Also, stop locking the cs_wallet mutex and registering for validationinterface notifications before the rescan. This was new behavior since 6a72f26 bitcoin#16426 and is not ideal because it stops other wallets and rpcs and the gui from receiving new notifications until after the scan completes. This change is a half-step towards implementing multiwallet parallel scans (bitcoin#11756), since it provides needed locator and birthday timestamp information to the Chain interface, and it rationalizes locking and event ordering in the startup code. The second half of implementing parallel rescans requires moving the ScanForWalletTransactions implementation (which this PR does not touch) from the wallet to the node.
Move wallet startup code closer to a simple model where the wallet attaches to the chain with a single chain.handleNotifications() call, and just starts passively receiving blocks and mempool notifications from the last update point, instead having to actively rescan blocks and request a mempool snapshot, and deal with the tip changing, and deal with early or stale notifications. Also, stop locking the cs_wallet mutex and registering for validationinterface notifications before the rescan. This was new behavior since 6a72f26 bitcoin#16426 and is not ideal because it stops other wallets and rpcs and the gui from receiving new notifications until after the scan completes. This change is a half-step towards implementing multiwallet parallel scans (bitcoin#11756), since it provides needed locator and birthday timestamp information to the Chain interface, and it rationalizes locking and event ordering in the startup code. The second half of implementing parallel rescans requires moving the ScanForWalletTransactions implementation (which this PR does not touch) from the wallet to the node.
Move wallet startup code closer to a simple model where the wallet attaches to the chain with a single chain.handleNotifications() call, and just starts passively receiving blocks and mempool notifications from the last update point, instead having to actively rescan blocks and request a mempool snapshot, and deal with the tip changing, and deal with early or stale notifications. Also, stop locking the cs_wallet mutex and registering for validationinterface notifications before the rescan. This was new behavior since 6a72f26 bitcoin#16426 and is not ideal because it stops other wallets and rpcs and the gui from receiving new notifications until after the scan completes. This change is a half-step towards implementing multiwallet parallel scans (bitcoin#11756), since it provides needed locator and birthday timestamp information to the Chain interface, and it rationalizes locking and event ordering in the startup code. The second half of implementing parallel rescans requires moving the ScanForWalletTransactions implementation (which this PR does not touch) from the wallet to the node.
Move wallet startup code closer to a simple model where the wallet attaches to the chain with a single chain.handleNotifications() call, and just starts passively receiving blocks and mempool notifications from the last update point, instead having to actively rescan blocks and request a mempool snapshot, and deal with the tip changing, and deal with early or stale notifications. Also, stop locking the cs_wallet mutex and registering for validationinterface notifications before the rescan. This was new behavior since 6a72f26 bitcoin#16426 and is not ideal because it stops other wallets and rpcs and the gui from receiving new notifications until after the scan completes. This change is a half-step towards implementing multiwallet parallel scans (bitcoin#11756), since it provides needed locator and birthday timestamp information to the Chain interface, and it rationalizes locking and event ordering in the startup code. The second half of implementing parallel rescans requires moving the ScanForWalletTransactions implementation (which this PR does not touch) from the wallet to the node.
This just drops three interfaces::Chain methods replacing them with other calls. Motivation for removing these chain methods: - Need to get rid of findFirstBlockWithTimeAndHeight for bitcoin#10102, which doesn't support overloaded methods - Followup from bitcoin#16426 (comment) - phantomcircuit comments about findNextBlock test http://www.erisian.com.au/bitcoin-core-dev/log-2020-06-06.html#l-214 Behavior is not changing in any way here. A TODO comment in ScanForWalletTransactions was removed, but just because it was invalid (see bitcoin#19195 (comment)), not because it was implemented.
Move wallet startup code closer to a simple model where the wallet attaches to the chain with a single chain.handleNotifications() call, and just starts passively receiving blocks and mempool notifications from the last update point, instead having to actively rescan blocks and request a mempool snapshot, and deal with the tip changing, and deal with early or stale notifications. Also, stop locking the cs_wallet mutex and registering for validationinterface notifications before the rescan. This was new behavior since 6a72f26 bitcoin#16426 and is not ideal because it stops other wallets and rpcs and the gui from receiving new notifications until after the scan completes. This change is a half-step towards implementing multiwallet parallel scans (bitcoin#11756), since it provides needed locator and birthday timestamp information to the Chain interface, and it rationalizes locking and event ordering in the startup code. The second half of implementing parallel rescans requires moving the ScanForWalletTransactions implementation (which this PR does not touch) from the wallet to the node.
Move wallet startup code closer to a simple model where the wallet attaches to the chain with a single chain.handleNotifications() call, and just starts passively receiving blocks and mempool notifications from the last update point, instead having to actively rescan blocks and request a mempool snapshot, and deal with the tip changing, and deal with early or stale notifications. Also, stop locking the cs_wallet mutex and registering for validationinterface notifications before the rescan. This was new behavior since 6a72f26 bitcoin#16426 and is not ideal because it stops other wallets and rpcs and the gui from receiving new notifications until after the scan completes. This change is a half-step towards implementing multiwallet parallel scans (bitcoin#11756), since it provides needed locator and birthday timestamp information to the Chain interface, and it rationalizes locking and event ordering in the startup code. The second half of implementing parallel rescans requires moving the ScanForWalletTransactions implementation (which this PR does not touch) from the wallet to the node.
|
FWIW, this PR causes the issue #19049. |
This change is intended to make the bitcoin node and its rpc, network and gui interfaces more responsive while the wallet is in use. Currently, because the node's
cs_mainmutex is always locked before the wallet'scs_walletmutex (to prevent deadlocks),cs_maincurrently stays locked while the wallet does relatively slow things like creating and listing transactions.Switching the lock order so
cs_mainis acquired aftercs_walletallowscs_mainto be only locked intermittently while the wallet is doing slow operations, so the node is not blocked waiting for the wallet.To review the present PR, most of getting right the move is ensuring any
LockAssertioninChain::Lockmethod is amended as aLOCK(cs_main). And in final commit, check that any wallet code which was previously locking the chain is now calling a method, enforcing the lock taking job. So far the only exception I found ishandleNotifications, which should be corrected.