
p2p: Replace per-peer transaction rate-limiting with global rate limits#34628

Open
ajtowns wants to merge 10 commits into bitcoin:master from ajtowns:202602-mempool-invtosend

Conversation

@ajtowns
Contributor

@ajtowns ajtowns commented Feb 20, 2026

Per-peer m_tx_inventory_to_send queues have CPU and memory costs that scale with both queue size and peer count. Under high transaction volume, this has previously caused severe issues (May 2023 disclosure) and still can cause measurable delays (Feb 2026 Runestone surge, with the msghand thread observed hitting 100% CPU and queue memory reaching ~95MB).

This PR replaces the per-peer rate limiting with a global queue using dual token buckets (limiting transactions by both count and serialized size). Transactions that arrive within the bucket capacity still relay nearly immediately, while excess transactions queue in a global backlog and drain as the token buckets refill.

Key parameters:

  • Count bucket: 14 tx/s, 420 capacity (30s buffer)
  • Size bucket: 20 kB/s (~12 MB/600s), 50 MB capacity
  • Outbound peers refill faster by a factor of 2.5
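
The interplay of the two buckets can be sketched as follows. This is a hypothetical minimal version for illustration only; `TokenBucket` and `TryRelay` here are simplified stand-ins, not the PR's actual classes:

```cpp
#include <algorithm>

// Simplified token bucket: refills at m_rate tokens/second up to m_cap.
class TokenBucket {
public:
    TokenBucket(double rate, double cap) : m_rate{rate}, m_cap{cap}, m_value{cap} {}
    // Refill based on elapsed seconds, clamped to capacity.
    void Refill(double elapsed_secs) {
        m_value = std::min(m_cap, m_value + m_rate * elapsed_secs);
    }
    // Consume n tokens; the balance may dip below zero.
    void Decrement(double n) { m_value -= n; }
    bool Available() const { return m_value >= 0.0; }
private:
    double m_rate, m_cap, m_value;
};

// Relay a tx of tx_size bytes only if both buckets currently have tokens.
bool TryRelay(TokenBucket& count_bucket, TokenBucket& size_bucket, double tx_size) {
    if (!count_bucket.Available() || !size_bucket.Available()) return false;
    count_bucket.Decrement(1.0);
    size_bucket.Decrement(tx_size);
    return true;
}
```

With the parameters above, the count bucket would be roughly `TokenBucket{14.0, 420.0}` and the size bucket `TokenBucket{20'000.0, 50'000'000.0}`, with outbound buckets refilling 2.5x faster.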

Per-peer queues are retained solely for privacy batching and are always fully emptied, removing the old INVENTORY_BROADCAST_MAX cap.

When the queuing logic is engaged during transaction spikes, this reduces the memory and CPU burden from O(queue * peers) to O(queue), as queued transactions no longer need to be retained or re-sorted per peer.

Design discussion: https://gist.github.com/ajtowns/d61bea974a07190fa6c6c8eaef3638b9

@DrahtBot DrahtBot added the P2P label Feb 20, 2026
@DrahtBot
Contributor

DrahtBot commented Feb 20, 2026

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK 0xB10C

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #34824 (net: refactor: replace Peer::TxRelay RecursiveMutex instances with Mutex by w0xlt)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

  • Multiplier for the inventory bucket rate for outbounds -> Multiplier for the inventory bucket rate for outbound peers [“outbounds” is an awkward plural here; “outbound peers” is clearer.]
  • # Confirm the replaced original was never announced, replacement was -> # Confirm the replaced original was never announced, replacement was announced [Comment is truncated/incomplete (“replacement was” with no completion), which hurts comprehension.]

2026-03-21 04:15:46

@ajtowns
Contributor Author

ajtowns commented Feb 20, 2026

CI failure is presumably either #34631 or #34387

/** Consume n tokens. Returns false if the balance dropped below m_max_debt. */
bool decrement(double n = 1.0)
{
    m_value -= n;


Decrement can still go below m_max_debt before checking is complete.

Contributor Author


decrement() can always go below m_max_debt, it only reports when it has done so -- it leaves it up to the caller to not go further into debt.
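
The contract being described can be sketched as follows. This is a minimal hypothetical version for illustration; the real class also tracks a clock and refills over time:

```cpp
// Minimal sketch of the decrement() contract discussed here: the balance
// always drops by n; the return value only reports whether the balance
// has now fallen below the m_max_debt floor. The caller is expected to
// stop consuming once false is returned. Names follow the quoted snippet.
class Bucket {
public:
    Bucket(double value, double max_debt) : m_value{value}, m_max_debt{max_debt} {}
    // Consume n tokens. Returns false if the balance dropped below m_max_debt.
    bool decrement(double n = 1.0) {
        m_value -= n;  // always applied, even when going (further) into debt
        return m_value >= m_max_debt;
    }
    double value() const { return m_value; }
private:
    double m_value;
    double m_max_debt;  // floor: how far negative the balance may go
};
```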

@0xB10C
Contributor

0xB10C commented Mar 12, 2026

Concept ACK!

I've been running this for a few days now and have written down a few observations on a small mass-broadcast event that happened a few hours ago: https://bnoc.xyz/t/increased-b-msghand-thread-utilization-due-to-runestone-transactions-on-2026-02-17/81/11

The node with this patch was significantly less affected than the others running a recent master.

I haven't set up any monitoring for the newly added getnetworkinfo fields yet.

assert(innerUsage == cachedInnerUsage);
}

std::vector<CTxMemPool::txiter> CTxMemPool::SortMiningScoreWithTopology(std::span<const Wtxid> wtxids, size_t n) const
Member


It looks like both eventual production call sites of this function (BumpInvVecForProcessing and PeerManagerImpl::SendMessages) do a deduplication pass on the results.

Would it make sense to do this on the fly inside this function? It can't use std::partial_sort anymore, but it can use std::make_heap and friends to implement partial sorting, with a dynamic end point until n distinct elements have been found? Something like

std::vector<CTxMemPool::txiter> CTxMemPool::SortMiningScoreWithTopology(std::span<const Wtxid> wtxids, size_t n) const
{
    auto cmp = [&](const auto& a, const auto& b) EXCLUSIVE_LOCKS_REQUIRED(cs) noexcept { return m_txgraph->CompareMainOrder(*a, *b) > 0; };

    std::vector<txiter> res;

    n = std::min(wtxids.size(), n);
    if (n > 0) {
        // Construct a heap with txiters for all wtxids that exist in the mempool.
        std::vector<txiter> heap;
        heap.reserve(wtxids.size());
        for (auto& wtxid : wtxids) {
            if (auto i{GetIter(wtxid)}; i.has_value()) {
                heap.push_back(i.value());
            }
        }
        std::ranges::make_heap(heap, cmp);

        // Pop transactions until n distinct ones in res have been found.
        res.reserve(heap.size());
        while (res.size() < n && !heap.empty()) {
            std::ranges::pop_heap(heap, cmp);
            if (res.empty() || heap.back() != res.back()) {
                res.push_back(heap.back());
            }
            heap.pop_back();
        }

        // Copy the remainder over, without sorting or deduplication.
        res.insert(res.end(), heap.begin(), heap.end());
    }

    return res;
}

With even more low-level code the duplicate vector can be avoided, I think. Tests don't pass with this, I haven't investigated why.

Contributor Author


Without having looked, is the comparison backwards?

My understanding is partial_sort has two benefits:

  • it only makes a heap out of the target size, so iterates through the source array once and then does log(k) work for each element, with better locality
  • when updating the k elements in the heap with a new element from the source, it does the sift-down algorithm, which is more efficient than std::push_heap()/std::pop_heap(), but isn't exposed via the STL, so it would mean writing your own heap implementation

Deduping the fairly small output list as you pass through it, when duplicates are rare anyway, seemed fine to me?
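
The approach defended here can be sketched with plain integers standing in for mempool iterators (a hypothetical illustration, not the PR's code): partial-sort the candidates to get the best n, then dedupe the small sorted prefix in one pass.

```cpp
#include <algorithm>
#include <vector>

// Select the n "best" (here: smallest) elements via std::partial_sort,
// then dedupe the sorted prefix in a single pass. Duplicates are assumed
// to be rare, so the output may contain slightly fewer than n elements.
std::vector<int> TopNDeduped(std::vector<int> v, size_t n) {
    n = std::min(n, v.size());
    std::partial_sort(v.begin(), v.begin() + n, v.end());
    std::vector<int> out;
    out.reserve(n);
    for (size_t i = 0; i < n; ++i) {
        if (out.empty() || v[i] != out.back()) out.push_back(v[i]);
    }
    return out;
}
```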

Member

@sipa sipa Mar 21, 2026


Without having looked, is the comparison backwards?

I don't think so. It's a max heap, but I want to pop the "lowest" elements off first, so I needed to swap the comparator I think.

My understanding is partial_sort has two benefits:

Interesting. So it has complexity O(n log m) (with n = number of elements, m = sorted prefix size), while the approach I have in mind is O(n + m log n) (O(n) to construct the heap of all elements, and then m operations of O(log n) each to extract the best m elements). Complexity-wise, my approach seems better, since n > m, except it has worse memory locality. This makes me wonder if I'm missing something, since the cppreference.com documentation seems to imply std::partial_sort is intended for low m values.

Deduping the fairly small output list as you pass through it, when duplicates are rare anyway, seemed fine to me?

Yeah, it probably doesn't matter that much. It just looked like the deduplication is something that CTxMemPool::SortMiningScoreWithTopology could do internally since both call sites need it anyway. And then it seemed possible to have the count be dynamic, but only when using a different approach than what std::partial_sort seems to enable.

*
* Tokens are added at a steady rate (m_rate per second) up to a capacity
* cap (m_cap). Tokens are removed by calling decrement(). The balance
* may go negative down to m_max_debt; decrement() returns false when
Member


In commit "util/tokenbucket.h: Provide a generic TokenBucket class"

Is it useful to support debt? I believe it can be avoided by a transformation that raises both m_value and m_cap by -m_max_debt.

Contributor Author


The distinction is in InvToSendBucket::avail() which says "you can start doing stuff as long as the size_bucket's value is >=0", which would have to get incremented as well to be equivalent.

The main effect is that when the size bucket is under pressure, you get a chance to relay at least 50kB each iteration, rather than the avail test passing as soon as you can relay 1B, then the loop ending immediately after you relay the first tx.

I think it can be simplified a bit by moving the max_debt value to just being a parameter of decrement() though. Will update.
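
The proposed simplification might look roughly like this (a hypothetical sketch; the actual updated API may differ): max_debt becomes a per-call parameter rather than a member, so a caller draining the size bucket can pass, say, a 50 kB debt allowance for one batch.

```cpp
// Hypothetical variant where the debt allowance is a parameter of
// Decrement() instead of bucket state. Names are illustrative.
class Bucket {
public:
    explicit Bucket(double value) : m_value{value} {}
    // Consume n tokens unconditionally; return false if the balance has
    // dropped below -max_debt afterwards.
    bool Decrement(double n, double max_debt = 0.0) {
        m_value -= n;
        return m_value >= -max_debt;
    }
private:
    double m_value;
};
```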

ajtowns added 7 commits March 21, 2026 12:22
All callers already have the full transaction available, so just pass that
through. This matches the signature for InitiateTxBroadcastPrivate()
and prepares for a later commit that needs the full transaction to
compute its serialized size for rate limiting.

Per-peer rate limiting introduces storage and compute costs proportional
to the number of peers. This has caused severe bugs in the past, and
continues to be a risk in the event of periods of extremely high rates
of transaction submission. Avoid these problems by always completely
emptying the m_tx_inventory_to_send queue when processing it.

Note that this increases the potential size of INV messages we send
for normal tx relay from ~1000 (limited by INVENTORY_BROADCAST_MAX)
to potentially 50000 (limited by MAX_INV_SZ).

Add a method for sorting a batch of transactions (specified as a vector
of wtxids) per mempool order, designed for transaction relay.

Change the per-peer tx relay queue from std::set to std::vector. This
reduces the memory usage and improves locality, at the cost of not
automatically deduping entries.

Now unused; replaced by SortMiningScoreWithTopology.

This is a simple token bucket parameterized on clock type, used in the
following commit.

Without the per-peer rate limiting, nodes can act as an amplifier for
transaction spam -- receiving many transactions from one node, but
relaying each of them to over 100 other nodes. Limit the impact of this
by providing a global rate limit.

This is implemented using dual token buckets, one that consumes a
token for every transaction, and one that consumes a token for every
serialized byte. This rate limits both per-tx resource usage (eg INV
messages) and overall relay bandwidth.

Main bucket parameters:
 * Count: 14tx/s rate, 420tx (30s) capacity
 * Size: 12MB/600s rate (4-6 blocks per target block interval), 50MB capacity

The size bucket is expected to be large enough to almost never have an
impact in normal usage, even during transaction storms, and is primarily
intended to mitigate attack-like scenarios.

Outbound connections get a separate pair of buckets, with rates boosted
by a 2.5x multiplier.

This avoids the excessive memory and CPU usage due to the 100x multiplier
from the queues being per-peer.

Note that this also reduces the size of INV messages we send for general
tx relay back to a more reasonable level of under 600 txs in 99.999%
of cases.
ajtowns added 3 commits March 21, 2026 12:22
Adds a debug-only configuration option to set the target
transaction/second rate for relay to inbound connections. This is mostly
intended to be set to artificially low values to aid in testing behaviour
when a backlog occurs, but is also available in case the default 14tx/s
target is somehow too low in practice.

Add `tx_send_rate` and `inv_buckets` fields to getnetworkinfo. The latter
has `inbound` and `outbound` entries, reporting backlog count, count
tokens, and size tokens. Useful for monitoring relay behavior.
@ajtowns ajtowns force-pushed the 202602-mempool-invtosend branch from 4b961e3 to 3a5b54c Compare March 21, 2026 04:15