Skip to content

fix: serialize TrySignChainTip to prevent concurrent signing race#7209

Merged
PastaPastaPasta merged 1 commit intodashpay:developfrom
thepastaclaw:fix/chainlock-concurrent-signing-race
Mar 12, 2026
Merged

fix: serialize TrySignChainTip to prevent concurrent signing race#7209
PastaPastaPasta merged 1 commit intodashpay:developfrom
thepastaclaw:fix/chainlock-concurrent-signing-race

Conversation

@thepastaclaw
Copy link

Motivation

TrySignChainTip() can be called concurrently from two independent threads:

  1. The signer's own m_scheduler thread (every 5 seconds, via Start())
  2. The CValidationInterface scheduler thread (via UpdatedBlockTip())

The function only briefly locks cs_signer around the lastSignedHeight check, then releases it to perform chain tip reads, IS lock checks, and eventually calls AsyncSignIfMember — only re-acquiring cs_signer to update lastSignedHeight/lastSignedRequestId/lastSignedMsgHash right before signing.

Under competing tips (short reorg between the two reads of m_chainstate.m_chain.Tip()), two concurrent calls can both pass the height check with different tips and issue AsyncSignIfMember with different block hashes for the same height-based requestId. This splits signing shares across different messages and can prevent a chainlock from forming.

Fix

Add a dedicated cs_try_sign mutex that serializes the entire TrySignChainTip() function. This ensures only one thread evaluates and signs a tip at a time.

Lock ordering: cs_try_signcs_maincs_signer (no deadlock risk as cs_try_sign is only acquired at TrySignChainTip entry).

Validation

  • Confirmed the signer's scheduler and CValidationInterface scheduler are independent threads with no cross-synchronization for TrySignChainTip
  • The handler's tryLockChainTipScheduled atomic only serializes within the handler scheduler, not across the signer scheduler
  • Verified lock ordering: cs_try_sign is always acquired first, no other code path acquires it, so no deadlock possible
  • Thread safety annotations updated on both TrySignChainTip() and UpdatedBlockTip() declarations

Checklist

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone

@thepastaclaw
Copy link
Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Mar 11, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@UdjinM6 UdjinM6 added this to the 23.1.2 milestone Mar 11, 2026
@coderabbitai
Copy link

coderabbitai bot commented Mar 11, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0c3d9510-8c76-4f1b-9de9-0478569dbd4e

📥 Commits

Reviewing files that changed from the base of the PR and between b1ce7f9 and 881bbce.

📒 Files selected for processing (2)
  • src/chainlock/signing.cpp
  • src/chainlock/signing.h
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/chainlock/signing.h

Walkthrough

The pull request adds a new private mutable mutex cs_try_sign to the ChainLockSigner class, updates lock annotations for UpdatedBlockTip() and TrySignChainTip() to require exclusive access to both cs_try_sign and cs_signer, and acquires cs_try_sign at the start of TrySignChainTip() (using a TRY_LOCK guard), returning early if the lock cannot be obtained. Changes are confined to two files and five lines.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: adding serialization to TrySignChainTip via a new mutex to prevent concurrent signing races.
Description check ✅ Passed The description provides comprehensive context about the concurrency issue, the specific fix, validation performed, and lock ordering analysis related to the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/chainlock/signing.h (1)

80-81: Encode ::cs_main in the lock contract.

TrySignChainTip() reacquires ::cs_main at Line 88 in src/chainlock/signing.cpp, so callers already need to enter with ::cs_main unlocked. Adding !::cs_main here would make that precondition—and the intended cs_try_sign -> ::cs_main -> cs_signer order—machine-checkable.

🔒 Suggested annotation tweak
     void UpdatedBlockTip(const CBlockIndex* pindexNew, const CBlockIndex* pindexFork, bool fInitialDownload) override
-        EXCLUSIVE_LOCKS_REQUIRED(!cs_try_sign, !cs_signer);
+        EXCLUSIVE_LOCKS_REQUIRED(!::cs_main, !cs_try_sign, !cs_signer);

 private:
-    void TrySignChainTip() EXCLUSIVE_LOCKS_REQUIRED(!cs_try_sign, !cs_signer);
+    void TrySignChainTip() EXCLUSIVE_LOCKS_REQUIRED(!::cs_main, !cs_try_sign, !cs_signer);

Also applies to: 89-89

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/chainlock/signing.h` around lines 80 - 81, UpdatedBlockTip's lock
annotation omits ::cs_main so the expected lock order (cs_try_sign -> ::cs_main
-> cs_signer) isn't machine-checkable; update the EXCLUSIVE_LOCKS_REQUIRED
contract on UpdatedBlockTip to include !::cs_main along with !cs_try_sign and
!cs_signer so callers must hold cs_try_sign and release ::cs_main before
acquiring cs_signer (aligning with TrySignChainTip which reacquires ::cs_main),
and ensure the same change is applied to the analogous declaration at the other
referenced location.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/chainlock/signing.h`:
- Around line 80-81: UpdatedBlockTip's lock annotation omits ::cs_main so the
expected lock order (cs_try_sign -> ::cs_main -> cs_signer) isn't
machine-checkable; update the EXCLUSIVE_LOCKS_REQUIRED contract on
UpdatedBlockTip to include !::cs_main along with !cs_try_sign and !cs_signer so
callers must hold cs_try_sign and release ::cs_main before acquiring cs_signer
(aligning with TrySignChainTip which reacquires ::cs_main), and ensure the same
change is applied to the analogous declaration at the other referenced location.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5047b082-cae5-495f-bcec-5e365e329de2

📥 Commits

Reviewing files that changed from the base of the PR and between 1598859 and b1ce7f9.

📒 Files selected for processing (2)
  • src/chainlock/signing.cpp
  • src/chainlock/signing.h

@github-actions
Copy link

github-actions bot commented Mar 11, 2026

✅ No Merge Conflicts Detected

This PR currently has no conflicts with other open PRs.


void ChainLockSigner::TrySignChainTip()
{
LOCK(cs_try_sign);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make sense to try lock and then if we can't acquire the lock just return here We don't really need to wait and then we run it immediately after This function will get called occasionally anyhow.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call — switched to TRY_LOCK with an early return if the lock can't be acquired. Since TrySignChainTip is called periodically anyway, there's no need to block waiting. Force-pushed.

@PastaPastaPasta PastaPastaPasta changed the title fix(chainlock): serialize TrySignChainTip to prevent concurrent signing race fix(): serialize TrySignChainTip to prevent concurrent signing race Mar 11, 2026
@PastaPastaPasta PastaPastaPasta changed the title fix(): serialize TrySignChainTip to prevent concurrent signing race fix: serialize TrySignChainTip to prevent concurrent signing race Mar 11, 2026
…ng race

TrySignChainTip can be called concurrently from two independent threads:
the signer's own scheduler (every 5s) and the CValidationInterface
scheduler (via UpdatedBlockTip). The function only briefly locks
cs_signer around the lastSignedHeight check before computing the
requestId/msgHash and calling AsyncSignIfMember.

Under competing tips (short reorg), two concurrent calls can both pass
the height check with different tips, update lastSignedHeight/msgHash
non-atomically, and issue AsyncSignIfMember with different block hashes
for the same height-based requestId. This splits signing shares and can
prevent chainlock formation.

Add a dedicated cs_try_sign mutex that serializes the entire
TrySignChainTip function. Lock ordering: cs_try_sign -> cs_main ->
cs_signer (no deadlock risk as cs_try_sign is only acquired here).
@thepastaclaw thepastaclaw force-pushed the fix/chainlock-concurrent-signing-race branch from b1ce7f9 to 881bbce Compare March 11, 2026 21:35
@PastaPastaPasta PastaPastaPasta marked this pull request as ready for review March 12, 2026 05:06
Copy link
Member

@PastaPastaPasta PastaPastaPasta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 881bbce

Copy link

@UdjinM6 UdjinM6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 881bbce

@PastaPastaPasta PastaPastaPasta merged commit 70733ac into dashpay:develop Mar 12, 2026
45 of 48 checks passed
thepastaclaw pushed a commit to thepastaclaw/dash that referenced this pull request Mar 12, 2026
…ent signing race

881bbce fix(chainlock): serialize TrySignChainTip to prevent concurrent signing race (PastaClaw)

Pull request description:

  ## Motivation

  `TrySignChainTip()` can be called concurrently from two independent threads:

  1. The signer's own `m_scheduler` thread (every 5 seconds, via `Start()`)
  2. The `CValidationInterface` scheduler thread (via `UpdatedBlockTip()`)

  The function only briefly locks `cs_signer` around the `lastSignedHeight` check, then releases it to perform chain tip reads, IS lock checks, and eventually calls `AsyncSignIfMember` — only re-acquiring `cs_signer` to update `lastSignedHeight`/`lastSignedRequestId`/`lastSignedMsgHash` right before signing.

  Under competing tips (short reorg between the two reads of `m_chainstate.m_chain.Tip()`), two concurrent calls can both pass the height check with different tips and issue `AsyncSignIfMember` with different block hashes for the same height-based `requestId`. This splits signing shares across different messages and can prevent a chainlock from forming.

  ## Fix

  Add a dedicated `cs_try_sign` mutex that serializes the entire `TrySignChainTip()` function. This ensures only one thread evaluates and signs a tip at a time.

  Lock ordering: `cs_try_sign` → `cs_main` → `cs_signer` (no deadlock risk as `cs_try_sign` is only acquired at `TrySignChainTip` entry).

  ## Validation

  - Confirmed the signer's scheduler and `CValidationInterface` scheduler are independent threads with no cross-synchronization for `TrySignChainTip`
  - The handler's `tryLockChainTipScheduled` atomic only serializes within the handler scheduler, not across the signer scheduler
  - Verified lock ordering: `cs_try_sign` is always acquired first, no other code path acquires it, so no deadlock possible
  - Thread safety annotations updated on both `TrySignChainTip()` and `UpdatedBlockTip()` declarations

  ## Checklist

  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas
  - [ ] I have added or updated relevant unit/integration/functional/e2e tests
  - [ ] I have made corresponding changes to the documentation
  - [ ] I have assigned this pull request to a milestone

ACKs for top commit:
  PastaPastaPasta:
    utACK 881bbce
  UdjinM6:
    utACK 881bbce

Tree-SHA512: b7746f0bf12063702e7ebfa60262b8e4a750472efe1bf6e06d9e7daf440d5e53fd67292f0191f016aeb3e6eed775b68b632c0f482bae831e44e1629639ec6d34
thepastaclaw added a commit to thepastaclaw/dash that referenced this pull request Mar 12, 2026
- Bump version: 23.1.1 → 23.1.2 (configure.ac)
- Update flatpak metainfo with v23.1.2 release entry
- Update release notes: include all v23.1.1 notes plus dashpay#7208 and dashpay#7209
- Update set-of-changes comparison link
- Add PastaPastaPasta to credits

v23.1.1 was scrubbed; releasing as v23.1.2 instead.
thepastaclaw pushed a commit to thepastaclaw/dash that referenced this pull request Mar 14, 2026
…ent signing race

881bbce fix(chainlock): serialize TrySignChainTip to prevent concurrent signing race (PastaClaw)

Pull request description:

  ## Motivation

  `TrySignChainTip()` can be called concurrently from two independent threads:

  1. The signer's own `m_scheduler` thread (every 5 seconds, via `Start()`)
  2. The `CValidationInterface` scheduler thread (via `UpdatedBlockTip()`)

  The function only briefly locks `cs_signer` around the `lastSignedHeight` check, then releases it to perform chain tip reads, IS lock checks, and eventually calls `AsyncSignIfMember` — only re-acquiring `cs_signer` to update `lastSignedHeight`/`lastSignedRequestId`/`lastSignedMsgHash` right before signing.

  Under competing tips (short reorg between the two reads of `m_chainstate.m_chain.Tip()`), two concurrent calls can both pass the height check with different tips and issue `AsyncSignIfMember` with different block hashes for the same height-based `requestId`. This splits signing shares across different messages and can prevent a chainlock from forming.

  ## Fix

  Add a dedicated `cs_try_sign` mutex that serializes the entire `TrySignChainTip()` function. This ensures only one thread evaluates and signs a tip at a time.

  Lock ordering: `cs_try_sign` → `cs_main` → `cs_signer` (no deadlock risk as `cs_try_sign` is only acquired at `TrySignChainTip` entry).

  ## Validation

  - Confirmed the signer's scheduler and `CValidationInterface` scheduler are independent threads with no cross-synchronization for `TrySignChainTip`
  - The handler's `tryLockChainTipScheduled` atomic only serializes within the handler scheduler, not across the signer scheduler
  - Verified lock ordering: `cs_try_sign` is always acquired first, no other code path acquires it, so no deadlock possible
  - Thread safety annotations updated on both `TrySignChainTip()` and `UpdatedBlockTip()` declarations

  ## Checklist

  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas
  - [ ] I have added or updated relevant unit/integration/functional/e2e tests
  - [ ] I have made corresponding changes to the documentation
  - [ ] I have assigned this pull request to a milestone

ACKs for top commit:
  PastaPastaPasta:
    utACK 881bbce
  UdjinM6:
    utACK 881bbce

Tree-SHA512: b7746f0bf12063702e7ebfa60262b8e4a750472efe1bf6e06d9e7daf440d5e53fd67292f0191f016aeb3e6eed775b68b632c0f482bae831e44e1629639ec6d34
PastaPastaPasta added a commit that referenced this pull request Mar 14, 2026
 into v23.1.x

da16809 docs: add #7221 and #7222 to v23.1.2 release notes (PastaClaw)
8a93926 Merge #7222: fix: properly skip evodb repair on reindex (pasta)
b74b549 Merge #7221: refactor: rename bitcoin-util manpage and test references to dash-util (pasta)
81464ac Merge #7211: fix: qt info tab layout (pasta)
7c27c2f Merge #7209: fix: serialize TrySignChainTip to prevent concurrent signing race (pasta)
81d5eb2 Merge #7208: fix: skip collecting block txids during IBD to prevent unbounded memory growth (pasta)

Pull request description:

  ## Backport

  Cherry-picks of #7208, #7209, #7211, #7221, and #7222 into `v23.1.x` for v23.1.2.

  ### Included

  - #7208 — `fix: skip collecting block txids during IBD to prevent unbounded memory growth`
  - #7209 — `fix: serialize TrySignChainTip to prevent concurrent signing race`
  - #7211 — `fix: qt info tab layout`
  - #7221 — `refactor: rename bitcoin-util manpage and test references to dash-util`
  - #7222 — `fix: properly skip evodb repair on reindex`

ACKs for top commit:
  UdjinM6:
    utACK da16809

Tree-SHA512: bbe74a62fd34bdcaece22100050072706774854db47d61af644700ca063e2a3cdfd474fa451681861bf1a6e91436ebd3715838640fc992fbba7c2b57b4f02760
PastaPastaPasta added a commit that referenced this pull request Mar 16, 2026
da16809 docs: add #7221 and #7222 to v23.1.2 release notes (PastaClaw)
8a93926 Merge #7222: fix: properly skip evodb repair on reindex (pasta)
b74b549 Merge #7221: refactor: rename bitcoin-util manpage and test references to dash-util (pasta)
81464ac Merge #7211: fix: qt info tab layout (pasta)
7c27c2f Merge #7209: fix: serialize TrySignChainTip to prevent concurrent signing race (pasta)
81d5eb2 Merge #7208: fix: skip collecting block txids during IBD to prevent unbounded memory growth (pasta)
d02243c ci: run check-skip on blacksmith with GitHub-hosted fallback (PastaClaw)
033b3fe chore: regenerate manpages for v23.1.2 (PastaClaw)
ff965b5 chore: v23.1.2 release preparation (PastaClaw)
8d5936d chore: add #7191 and #7193 to v23.1.1 release notes (PastaClaw)
9f3662b chore: v23.1.1 release preparation (PastaClaw)
5dbfa98 chore: v23.1.1 release preparation (PastaClaw)
240a95f Merge #7193: fix: reject identity elements in deserialization and key generation (pasta)
444cbf2 Merge #7191: fix(qt): reseat quorum labels when new types are inserted (pasta)
00f590d Merge #7180: qt: add Tahoe styled icons for macOS, runtime styling for each network type, update bundle icon, add mask-based tray icon, generation scripts (pasta)
60dda51 Merge #7176: perf: do linear lookup instead building 2 heavy Hash-Maps (pasta)
df1ca87 Merge #7159: feat(qt): UI refresh (5/n, add proposal information widget to information, donut chart for proposal allocation) (pasta)
9061ad0 Merge #7118: feat(qt): UI refresh (4/n, introduce distinct widgets for Dash-specific reporting in debug window) (pasta)
64cc4f2 Merge #7160: feat(interfaces): consolidate masternode counts into one struct, expose chainlock, instantsend, credit pool, quorum statistics (pasta)
5d28a69 Merge #7157: fix(qt): prevent banned masternodes from returning status=0 (pasta)
e0b7386 Merge #7146: feat(qt): introduce framework for sourcing and applying data, use for `{Masternode,Proposal}List`s (pasta)
8fd53cd Merge #7144: feat(qt): add support for reporting `OP_RETURN` payloads as Data Transactions (pasta)
cc6f0bb Merge #7154: fix: MN update notifications had old_list/new_list swapped (pasta)
33f0138 Merge #7145: fix(qt): move labelError styling from proposalcreate.ui into general.css (pasta)
1bdbde6 Merge #7148: feat(qt): persist filter preferences in masternode list (pasta)
96bb601 Merge #7147: fix(qt): prevent overview page font double scaling, recalculate minimum width correctly, `SERVICE` and `STATUS` sorting, fix common types filtering (pasta)
da1e336 build: expand minimum Darwin version to macOS 11 (Big Sur) (Kittywhiskers Van Gogh)

Pull request description:

  ## Issue being fixed or feature implemented
  Note: Skipping changes from #7149 which was for the v23.1.x only.

  ## What was done?

  ## How Has This Been Tested?

  ## Breaking Changes

  ## Checklist:
  - [ ] I have performed a self-review of my own code
  - [ ] I have commented my code, particularly in hard-to-understand areas
  - [ ] I have added or updated relevant unit/integration/functional/e2e tests
  - [ ] I have made corresponding changes to the documentation
  - [ ] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  kwvg:
    utACK 36988f9

Tree-SHA512: f5bf8f0af11379bbcea606108ee90af08c16f588ebdbfac1fdd567adfcad14926b9c797c8fa6b398fb65fc3210c5f2c084015ea07718371df04e2412625f42d4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants