fix: serialize TrySignChainTip to prevent concurrent signing race #7209
@coderabbitai review
✅ Actions performed: Review triggered.
Walkthrough: The pull request adds a new private mutable mutex
Estimated code review effort: 2 (Simple), ~12 minutes
Pre-merge checks: 2 passed, 1 failed (1 warning)
🧹 Nitpick comments (1)
src/chainlock/signing.h (1)
80-81: Encode `::cs_main` in the lock contract.
`TrySignChainTip()` reacquires `::cs_main` at Line 88 in `src/chainlock/signing.cpp`, so callers already need to enter with `::cs_main` unlocked. Adding `!::cs_main` here would make that precondition, and the intended `cs_try_sign -> ::cs_main -> cs_signer` order, machine-checkable.

🔒 Suggested annotation tweak

      void UpdatedBlockTip(const CBlockIndex* pindexNew, const CBlockIndex* pindexFork, bool fInitialDownload) override
    -     EXCLUSIVE_LOCKS_REQUIRED(!cs_try_sign, !cs_signer);
    +     EXCLUSIVE_LOCKS_REQUIRED(!::cs_main, !cs_try_sign, !cs_signer);
      private:
    -     void TrySignChainTip() EXCLUSIVE_LOCKS_REQUIRED(!cs_try_sign, !cs_signer);
    +     void TrySignChainTip() EXCLUSIVE_LOCKS_REQUIRED(!::cs_main, !cs_try_sign, !cs_signer);

Also applies to: 89-89
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/chainlock/signing.h` around lines 80 - 81, UpdatedBlockTip's lock annotation omits ::cs_main so the expected lock order (cs_try_sign -> ::cs_main -> cs_signer) isn't machine-checkable; update the EXCLUSIVE_LOCKS_REQUIRED contract on UpdatedBlockTip to include !::cs_main along with !cs_try_sign and !cs_signer so callers must enter with ::cs_main, cs_try_sign, and cs_signer all unlocked (aligning with TrySignChainTip, which reacquires ::cs_main), and ensure the same change is applied to the analogous declaration at the other referenced location.
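As an illustration of what "machine-checkable" means here, the following is a minimal sketch of negative capability annotations using Clang's thread-safety attributes directly. Everything in it is an assumption made for illustration: `TSA`, `AnnotatedMutex`, `g_main` (a stand-in for `::cs_main`), and `SignerSketch` are hypothetical names, not the project's `sync.h` macros or the real `ChainLockSigner`. On compilers other than Clang the attributes expand to nothing, and the negative requirements are only diagnosed when Clang's thread-safety warnings (including the negative-capability warning) are enabled.

```cpp
#include <cassert>
#include <mutex>

// Thread-safety attributes are a Clang extension; expand to nothing elsewhere.
#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif

// Hypothetical annotated mutex wrapper (illustration only).
class TSA(capability("mutex")) AnnotatedMutex
{
public:
    void lock() TSA(acquire_capability()) { m_impl.lock(); }
    void unlock() TSA(release_capability()) { m_impl.unlock(); }

private:
    std::mutex m_impl;
};

AnnotatedMutex g_main; // hypothetical stand-in for ::cs_main

class SignerSketch
{
public:
    // Negative capabilities: the caller must hold none of these locks on entry.
    int TrySignChainTip(int tip_height)
        TSA(requires_capability(!g_main, !m_try_sign, !m_signer))
    {
        assert(tip_height >= 0);
        m_try_sign.lock(); // 1st in the documented order
        g_main.lock();     // 2nd: would read the chain tip here
        g_main.unlock();
        m_signer.lock();   // 3rd: update signer state
        m_last_signed_height = tip_height;
        const int recorded = m_last_signed_height;
        m_signer.unlock();
        m_try_sign.unlock();
        return recorded;
    }

private:
    AnnotatedMutex m_try_sign;
    AnnotatedMutex m_signer;
    int m_last_signed_height TSA(guarded_by(m_signer)){-1};
};
```

With the `!g_main` term present, a caller that already holds `g_main` can be flagged at compile time instead of deadlocking or inverting the lock order at run time.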
src/chainlock/signing.cpp
Outdated

    void ChainLockSigner::TrySignChainTip()
    {
        LOCK(cs_try_sign);
I think it would make sense to try-lock here and simply return if we can't acquire the lock. We don't really need to wait and then run it immediately afterwards. This function will get called occasionally anyhow.
Good call — switched to TRY_LOCK with an early return if the lock can't be acquired. Since TrySignChainTip is called periodically anyway, there's no need to block waiting. Force-pushed.
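The try-lock-and-return pattern discussed in this thread can be sketched with the standard library alone. This is a hedged illustration, not the project's code: it uses `std::unique_lock` with `std::try_to_lock` in place of the `TRY_LOCK` macro, and `TipSignerSketch`, `m_try_sign`, `Runs()`, and `ConcurrentCallIsSkipped()` are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the signer; only the locking pattern is the point.
class TipSignerSketch
{
public:
    // Runs `work` only if no other call is in progress; otherwise returns
    // false right away instead of blocking.
    bool TrySignChainTip(const std::function<void()>& work)
    {
        assert(work && "work must be callable");
        std::unique_lock<std::mutex> lock(m_try_sign, std::try_to_lock);
        if (!lock.owns_lock()) {
            return false; // another thread is signing; a later call retries
        }
        ++m_runs; // protected by m_try_sign
        work();
        return true;
    }

    int Runs()
    {
        std::lock_guard<std::mutex> lock(m_try_sign);
        return m_runs;
    }

private:
    std::mutex m_try_sign;
    int m_runs{0};
};

// Demonstration: while one call holds the lock, a call made from a second
// thread is skipped rather than queued behind it.
inline bool ConcurrentCallIsSkipped()
{
    TipSignerSketch signer;
    bool inner_ran = true;
    const bool outer_ran = signer.TrySignChainTip([&] {
        std::thread other([&] { inner_ran = signer.TrySignChainTip([] {}); });
        other.join();
    });
    return outer_ran && !inner_ran && signer.Runs() == 1;
}
```

The trade-off is that a skipped call does no work at all, which is acceptable only because, as noted above, the function is invoked again periodically.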
…ng race

TrySignChainTip can be called concurrently from two independent threads: the signer's own scheduler (every 5s) and the CValidationInterface scheduler (via UpdatedBlockTip). The function only briefly locks cs_signer around the lastSignedHeight check before computing the requestId/msgHash and calling AsyncSignIfMember.

Under competing tips (short reorg), two concurrent calls can both pass the height check with different tips, update lastSignedHeight/msgHash non-atomically, and issue AsyncSignIfMember with different block hashes for the same height-based requestId. This splits signing shares and can prevent chainlock formation.

Add a dedicated cs_try_sign mutex that serializes the entire TrySignChainTip function.

Lock ordering: cs_try_sign -> cs_main -> cs_signer (no deadlock risk as cs_try_sign is only acquired here).
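To make the check-then-act race and its fix concrete, here is a small model of the situation described in this commit message. It is a sketch under stated assumptions, not the actual `ChainLockSigner`: `SignerModel`, `m_try_sign`, `m_signer`, `Requests()`, and `SignRequestsForCompetingTips()` are hypothetical names, and appending to a vector stands in for the call to `AsyncSignIfMember`. The outer mutex is held for the whole function, so the height check and the later update behave as one step even though the inner lock is released in between.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical model of the signer; not the real class.
class SignerModel
{
public:
    void TrySignChainTip(int height, const std::string& block_hash)
    {
        assert(height >= 0);
        // Outer lock: serializes the entire function.
        std::lock_guard<std::mutex> serialize(m_try_sign);
        {
            // Inner lock: held only briefly around the height check.
            std::lock_guard<std::mutex> lock(m_signer);
            if (m_last_signed_height >= height) return; // already signed
        }
        // Without the outer lock, a second thread could pass the check above
        // at this point with a different tip for the same height.
        {
            std::lock_guard<std::mutex> lock(m_signer);
            m_last_signed_height = height;
        }
        m_requests.emplace_back(height, block_hash); // stands in for signing
    }

    std::vector<std::pair<int, std::string>> Requests() const
    {
        std::lock_guard<std::mutex> serialize(m_try_sign);
        return m_requests;
    }

private:
    mutable std::mutex m_try_sign; // acquired first
    mutable std::mutex m_signer;   // acquired after m_try_sign
    int m_last_signed_height{-1};
    std::vector<std::pair<int, std::string>> m_requests;
};

// Two threads offer competing tips at the same height; with the outer lock
// in place only one sign request is recorded, whichever thread wins.
inline std::size_t SignRequestsForCompetingTips()
{
    SignerModel signer;
    std::thread a([&] { signer.TrySignChainTip(100, "tip-a"); });
    std::thread b([&] { signer.TrySignChainTip(100, "tip-b"); });
    a.join();
    b.join();
    return signer.Requests().size();
}
```

Which of the two tips gets signed still depends on scheduling; the model only shows that the same height is never signed twice with different hashes.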
Force-pushed from b1ce7f9 to 881bbce
…ent signing race

881bbce fix(chainlock): serialize TrySignChainTip to prevent concurrent signing race (PastaClaw)

Pull request description:

## Motivation

`TrySignChainTip()` can be called concurrently from two independent threads:

1. The signer's own `m_scheduler` thread (every 5 seconds, via `Start()`)
2. The `CValidationInterface` scheduler thread (via `UpdatedBlockTip()`)

The function only briefly locks `cs_signer` around the `lastSignedHeight` check, then releases it to perform chain tip reads, IS lock checks, and eventually calls `AsyncSignIfMember`, only re-acquiring `cs_signer` to update `lastSignedHeight`/`lastSignedRequestId`/`lastSignedMsgHash` right before signing.

Under competing tips (short reorg between the two reads of `m_chainstate.m_chain.Tip()`), two concurrent calls can both pass the height check with different tips and issue `AsyncSignIfMember` with different block hashes for the same height-based `requestId`. This splits signing shares across different messages and can prevent a chainlock from forming.

## Fix

Add a dedicated `cs_try_sign` mutex that serializes the entire `TrySignChainTip()` function. This ensures only one thread evaluates and signs a tip at a time.

Lock ordering: `cs_try_sign` → `cs_main` → `cs_signer` (no deadlock risk as `cs_try_sign` is only acquired at `TrySignChainTip` entry).

## Validation

- Confirmed the signer's scheduler and `CValidationInterface` scheduler are independent threads with no cross-synchronization for `TrySignChainTip`
- The handler's `tryLockChainTipScheduled` atomic only serializes within the handler scheduler, not across the signer scheduler
- Verified lock ordering: `cs_try_sign` is always acquired first, no other code path acquires it, so no deadlock possible
- Thread safety annotations updated on both `TrySignChainTip()` and `UpdatedBlockTip()` declarations

## Checklist

- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have added or updated relevant unit/integration/functional/e2e tests
- [ ] I have made corresponding changes to the documentation
- [ ] I have assigned this pull request to a milestone

ACKs for top commit:
  PastaPastaPasta: utACK 881bbce
  UdjinM6: utACK 881bbce

Tree-SHA512: b7746f0bf12063702e7ebfa60262b8e4a750472efe1bf6e06d9e7daf440d5e53fd67292f0191f016aeb3e6eed775b68b632c0f482bae831e44e1629639ec6d34
- Bump version: 23.1.1 → 23.1.2 (configure.ac)
- Update flatpak metainfo with v23.1.2 release entry
- Update release notes: include all v23.1.1 notes plus dashpay#7208 and dashpay#7209
- Update set-of-changes comparison link
- Add PastaPastaPasta to credits

v23.1.1 was scrubbed; releasing as v23.1.2 instead.
into v23.1.x

da16809 docs: add #7221 and #7222 to v23.1.2 release notes (PastaClaw)
8a93926 Merge #7222: fix: properly skip evodb repair on reindex (pasta)
b74b549 Merge #7221: refactor: rename bitcoin-util manpage and test references to dash-util (pasta)
81464ac Merge #7211: fix: qt info tab layout (pasta)
7c27c2f Merge #7209: fix: serialize TrySignChainTip to prevent concurrent signing race (pasta)
81d5eb2 Merge #7208: fix: skip collecting block txids during IBD to prevent unbounded memory growth (pasta)

Pull request description:

## Backport

Cherry-picks of #7208, #7209, #7211, #7221, and #7222 into `v23.1.x` for v23.1.2.

### Included

- #7208: `fix: skip collecting block txids during IBD to prevent unbounded memory growth`
- #7209: `fix: serialize TrySignChainTip to prevent concurrent signing race`
- #7211: `fix: qt info tab layout`
- #7221: `refactor: rename bitcoin-util manpage and test references to dash-util`
- #7222: `fix: properly skip evodb repair on reindex`

ACKs for top commit:
  UdjinM6: utACK da16809

Tree-SHA512: bbe74a62fd34bdcaece22100050072706774854db47d61af644700ca063e2a3cdfd474fa451681861bf1a6e91436ebd3715838640fc992fbba7c2b57b4f02760
da16809 docs: add #7221 and #7222 to v23.1.2 release notes (PastaClaw)
8a93926 Merge #7222: fix: properly skip evodb repair on reindex (pasta)
b74b549 Merge #7221: refactor: rename bitcoin-util manpage and test references to dash-util (pasta)
81464ac Merge #7211: fix: qt info tab layout (pasta)
7c27c2f Merge #7209: fix: serialize TrySignChainTip to prevent concurrent signing race (pasta)
81d5eb2 Merge #7208: fix: skip collecting block txids during IBD to prevent unbounded memory growth (pasta)
d02243c ci: run check-skip on blacksmith with GitHub-hosted fallback (PastaClaw)
033b3fe chore: regenerate manpages for v23.1.2 (PastaClaw)
ff965b5 chore: v23.1.2 release preparation (PastaClaw)
8d5936d chore: add #7191 and #7193 to v23.1.1 release notes (PastaClaw)
9f3662b chore: v23.1.1 release preparation (PastaClaw)
5dbfa98 chore: v23.1.1 release preparation (PastaClaw)
240a95f Merge #7193: fix: reject identity elements in deserialization and key generation (pasta)
444cbf2 Merge #7191: fix(qt): reseat quorum labels when new types are inserted (pasta)
00f590d Merge #7180: qt: add Tahoe styled icons for macOS, runtime styling for each network type, update bundle icon, add mask-based tray icon, generation scripts (pasta)
60dda51 Merge #7176: perf: do linear lookup instead building 2 heavy Hash-Maps (pasta)
df1ca87 Merge #7159: feat(qt): UI refresh (5/n, add proposal information widget to information, donut chart for proposal allocation) (pasta)
9061ad0 Merge #7118: feat(qt): UI refresh (4/n, introduce distinct widgets for Dash-specific reporting in debug window) (pasta)
64cc4f2 Merge #7160: feat(interfaces): consolidate masternode counts into one struct, expose chainlock, instantsend, credit pool, quorum statistics (pasta)
5d28a69 Merge #7157: fix(qt): prevent banned masternodes from returning status=0 (pasta)
e0b7386 Merge #7146: feat(qt): introduce framework for sourcing and applying data, use for `{Masternode,Proposal}List`s (pasta)
8fd53cd Merge #7144: feat(qt): add support for reporting `OP_RETURN` payloads as Data Transactions (pasta)
cc6f0bb Merge #7154: fix: MN update notifications had old_list/new_list swapped (pasta)
33f0138 Merge #7145: fix(qt): move labelError styling from proposalcreate.ui into general.css (pasta)
1bdbde6 Merge #7148: feat(qt): persist filter preferences in masternode list (pasta)
96bb601 Merge #7147: fix(qt): prevent overview page font double scaling, recalculate minimum width correctly, `SERVICE` and `STATUS` sorting, fix common types filtering (pasta)
da1e336 build: expand minimum Darwin version to macOS 11 (Big Sur) (Kittywhiskers Van Gogh)

Pull request description:

## Issue being fixed or feature implemented

Note: Skipping changes from #7149 which was for the v23.1.x only.

## What was done?

## How Has This Been Tested?

## Breaking Changes

## Checklist:

- [ ] I have performed a self-review of my own code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have added or updated relevant unit/integration/functional/e2e tests
- [ ] I have made corresponding changes to the documentation
- [ ] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  kwvg: utACK 36988f9

Tree-SHA512: f5bf8f0af11379bbcea606108ee90af08c16f588ebdbfac1fdd567adfcad14926b9c797c8fa6b398fb65fc3210c5f2c084015ea07718371df04e2412625f42d4
## Motivation

`TrySignChainTip()` can be called concurrently from two independent threads:

1. The signer's own `m_scheduler` thread (every 5 seconds, via `Start()`)
2. The `CValidationInterface` scheduler thread (via `UpdatedBlockTip()`)

The function only briefly locks `cs_signer` around the `lastSignedHeight` check, then releases it to perform chain tip reads, IS lock checks, and eventually calls `AsyncSignIfMember`, only re-acquiring `cs_signer` to update `lastSignedHeight`/`lastSignedRequestId`/`lastSignedMsgHash` right before signing.

Under competing tips (short reorg between the two reads of `m_chainstate.m_chain.Tip()`), two concurrent calls can both pass the height check with different tips and issue `AsyncSignIfMember` with different block hashes for the same height-based `requestId`. This splits signing shares across different messages and can prevent a chainlock from forming.

## Fix

Add a dedicated `cs_try_sign` mutex that serializes the entire `TrySignChainTip()` function. This ensures only one thread evaluates and signs a tip at a time.

Lock ordering: `cs_try_sign` → `cs_main` → `cs_signer` (no deadlock risk as `cs_try_sign` is only acquired at `TrySignChainTip` entry).

## Validation

- Confirmed the signer's scheduler and `CValidationInterface` scheduler are independent threads with no cross-synchronization for `TrySignChainTip`
- The handler's `tryLockChainTipScheduled` atomic only serializes within the handler scheduler, not across the signer scheduler
- Verified lock ordering: `cs_try_sign` is always acquired first, no other code path acquires it, so no deadlock possible
- Thread safety annotations updated on both `TrySignChainTip()` and `UpdatedBlockTip()` declarations

## Checklist