cluster mempool: introduce TxGraph by sipa · Pull Request #31363 · bitcoin/bitcoin

sipa · 2024-11-24T15:59:18Z

Part of cluster mempool: #30289.

1. Overview

This introduces the TxGraph class, which encapsulates knowledge about the (effective) fees, sizes, and dependencies between all mempool transactions, but nothing else. In particular, it lacks knowledge about CTransaction, inputs, outputs, txids, wtxids, prioritization, validatity, policy rules, and a lot more. Being restricted to just those aspects of the mempool makes the behavior very easy to fully specify (ignoring the actual linearizations produced), and write simulation-based tests for (which are included in this PR).

2. Interface

The interface can be largely categorized into:

Mutation functions:
- AddTransaction (add a new transaction with specified feerate, and get a Ref object back to identify it).
- RemoveTransaction (given a Ref object, remove the transaction).
- AddDependency (given two Ref objects, add a dependency between them).
- SetTransactionFee (modify the fee associated with a Ref object).
Inspector functions:
- GetAncestors (get the ancestor set in the form of Ref* pointers)
- GetAncestorsUnion (like above, but for the union of ancestors of multiple Ref* pointers)
- GetDescendants (get the descendant set in the form of Ref* pointers)
- GetDescendantsUnion (like above, but for the union of ancestors of multiple Ref* pointers)
- GetCluster (get the connected component set in the form of Ref* pointers, in the order they would be mined).
- GetIndividualFeerate (get the feerate of a transaction)
- GetChunkFeerate (get the mining score of a transaction)
- CountDistinctClusters (count the number of distinct clusters a list of Refs belong to)
Staging functions:
- StartStaging (make all future mutations operate on a proposed transaction graph)
- CommitStaging (apply all the changes that are staged)
- AbortStaging (discard all the changes that are staged)
Miscellaneous functions:
- DoWork (do queued-up computations now, so that future operations are fast)

This TxGraph::Ref type used as a "handle" on transactions in the graph can be inherited from, and the idea is that in the full cluster mempool implementation (#28676, after it is rebased on this), CTxMempoolEntry will inherit from it, and all actually used Ref objects will be CTxMempoolEntrys. With that, the mempool code can just cast any Ref* returned by txgraph to CTxMempoolEntry*.

3. Implementation

Internally the graph data is kept in clustered form (partitioned into connected components), for which linearizations are maintained and updated as needed using the cluster_linearize.h algorithms under the hood, but this is hidden from the users of this class. Implementation-wise, mutations are generally applied lazily, appending to queues of to-be-removed transactions and to-be-added dependencies, so they can be batched for higher performance. Inspectors will generally only evaluate as much as is needed to answer queries, with roughly 5 levels of processing to go to fully instantiated and acceptable cluster linearizations, in order:

ApplyRemovals (take batches of to-be-removed transactions and translate them to "holes" in the corresponding Clusters/DepGraphs).
SplitAll (creating holes in Clusters may cause them to break apart into smaller connected components, so make turn them into separate Clusters/linearizations).
GroupClusters (figure out which Clusters will need to be combined in order to add requested to-be-added dependencies, as these may span clusters).
ApplyDependencies (actually merge Clusters as precomputed by GroupClusters, and add the dependencies between them).
MakeAcceptable (perform the LIMO linearization algorithm on Clusters to make sure their linearizations are acceptable).

4. Future work

This is only an initial version of TxGraph, and some functionality is missing before #28676 can be rebased on top of it:

The ability to get comparative feerate diagrams before/after for the set of staged changes (to evaluate RBF incentive-compatibility).
Mining interface (ability to iterate transactions quickly in mining score order) (see cluster mempool: add txgraph diagrams/mining/eviction #31444).
Eviction interface (reverse of mining order, plus memory usage accounting) (see cluster mempool: add txgraph diagrams/mining/eviction #31444).
Ability to fix oversizedness of clusters (before or after committing) - this is needed for reorgs where aborting/rejecting the change just is not an option (see cluster mempool: add TxGraph reorg functionality #31553).
Interface for controlling how much effort is spent on LIMO. In this PR it is hardcoded.

Then there are further improvements possible which would not block other work:

Making Cluster a virtual class with different implementations based on transaction count (which could dramatically reduce memory usage, as most Clusters are just a single transaction, for which the current implementation is overkill).
The ability to have background thread(s) for improving cluster linearizations.

DrahtBot · 2024-11-24T15:59:21Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31363.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	instagibbs, ajtowns, ismaelsadeeq, glozow
Concept ACK	jonatack

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#30605 (Cluster linearization: separate tests from tests-of-tests by sipa)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot · 2024-11-24T16:20:29Z

🚧 At least one of the CI tasks failed.
_{Debug: https://github.com/bitcoin/bitcoin/runs/33443336095}

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

jonatack · 2024-12-05T14:22:51Z

Concept ACK

sipa · 2024-12-20T16:33:51Z

I'm aware this is a big chunk of code, but perhaps this can aid review a bit.

The biggest commits is txgraph: (feature) add initial version. To get an idea of what is going on, I think it's useful to look (in order) at:

The src/txgraph.h file added in that commit, as it defines the public interface.
The src/test/fuzz/txgraph.cpp test added in the next commit (txgraph: (tests) add simulation fuzz test), as it does to an extent show how the interface is used, and what is expected of it.
The data structures (TxGraphImpl and Cluster) in src/txgraph.cpp, and the internal functions corresponding to the 5 processing steps that happen:
- Batch removal of queued transaction removals: TxGraphImpl::ApplyRemovals and Cluster::ApplyRemovals.
- Splitting of clusters into components after removing transactions: TxGraphImpl::SplitAll, TxGraphImpl::Split, and Cluster::Split
- Computing which clusters will need to be combined due to queued dependencies being added between them: TxGraphImpl::GroupClusters.
- Merging of clusters and applying queued dependencies to them: TxGraphImpl::ApplyDependencies, TxGraphImpl::Merge, Cluster::Merge, Cluster::ApplyDependencies.
- Making the linearization of clusters acceptable: TxGraphImpl::MakeAcceptable and Cluster::Relinearize.
The interface implementation functions that are built on top in TxGraphImpl: AddTransaction, RemoveTransaction, AddDependency, SetTransactionFee, Cleanup, GetAncestors, GetDescendants, GetCluster, GetChunkFeerate, etc.

sipa · 2024-12-22T17:23:23Z

I have pushed a substantial change to the TxGraphImpl::GroupClusters function (in commit "txgraph: (feature) add initial version"). In a benchmark with 64000 transactions being combined into 1000 clusters, this makes it go from ~160ms to ~16ms. I will add the benchmark to this PR when I clean it up.

ismaelsadeeq

Concept ACK

GetChunkFeerate (get the mining score of a transaction)

From reading the description and skimming through the code, IIUC TxGraph encapsulates knowledge about fees and sizes but lacks knowledge about prioritization. This implies that the fee it uses is the effective fee, not the modified fee that includes prioritization.

How are we accounting for the real mining score of a Ref without that prioritization knowledge? The current mining algorithm uses the modified ancestor fee to calculate the mining score.

Could you clarify how prioritization will be handled post-cluster mempool?

sipa · 2024-12-24T19:41:16Z

@ismaelsadeeq TxGraph (and DepGraph, and FeeFrac) just treat "fee" and "size" as numbers whose ratio is to be maximized. These classes don't care what meaning they correspond to.

In practice, I expect that the mempool code will provide the modified fee as "fee", and vsize or weight as "size".

ismaelsadeeq · 2024-12-24T20:37:46Z

Thank you for clarifying. Reading 'lacks knowledge about prioritization' led me to infer that.
But I see now you are talking about mapDeltas

src/txgraph.h

src/cluster_linearize.h

sipa · 2025-01-09T21:49:44Z

A few changes:

Added a TxGraph::DoWork() functions which performs queued-up computations now, so that future operations are fast.
Make Ref& arguments to TxGraph::AddDependency, TxGraph::RemoveTransaction, and TxGraph::SetTransactionFee const. They do not modify the Ref itself; they only modify what the Ref points to.
Address review comments by @ismaelsadeeq here, and @theuni on cluster mempool: add TxGraph reorg functionality #31553.

This adds a function to query the chunk feerate of a transaction, by caching it inside the Entry objects.

…ature) Instead of leaving the responsibility on higher layers to guarantee that no connected component within TxGraph (a barely exposed concept, except through GetCluster()) exceeds the cluster count limit, move this responsibility to TxGraph itself: * TxGraph retains a cluster count limit, but it becomes configurable at construction time (this primarily helps with testing that it is properly enforced). * It is always allowed to perform mutators on TxGraph, even if they would cause the cluster count limit to be exceeded. Instead, TxGraph exposes an IsOversized() function, which queries whether it is in a special "oversize" state. * During oversize state, many inspectors are unavailable, but mutators remain valid, so the higher layer can "fix" the oversize state before continuing.

The m_deps_to_add vector is sorted by child Cluster*, which matches the order of an_clusters. This means we can walk through m_deps_to_add while doing the representative lookups for an_clusters, and reuse them.

…tion) Since m_deps_to_add has been sorted by child Cluster* already, all dependencies with the same child will be processed consecutively. Take advantage of this by remember the last partition merged with, and reusing that if applicable.

Chunk-based information (primarily, chunk feerates) are never accessed without first bringing the relevant Clusters to an "acceptable" quality level. Thus, while operations are ongoing and Clusters are not acceptable, we can omit computing the chunkings and chunk feerates for Clusters.

When transactions are removed from the tail of a cluster, we know the existing linearization remains acceptable (if it already was), but may just need splitting and postlinearization, so special case these into separate quality levels.

This is a preparation for a next commit where a TxGraph will start representing potentially two distinct graphs (a main one, and a staging one with proposed changes).

Move a number of related modifications to TxGraphImpl into a separate function for removal of transactions. This is preparation for a later commit where this will be useful in more than one place.

In order to make it easy to evaluate proposed changes to a TxGraph, introduce a "staging" mode, where mutators (AddTransaction, AddDependency, RemoveTransaction) do not modify the actual graph, but just a staging version of it. That staging graph can then be commited (replacing the main one with it), or aborted (discarding the staging).

Before this commit, if a TxGraph::Ref object is destroyed, it becomes impossible to refer to, but the actual corresponding transaction node in the TxGraph remains, and remains indefinitely as there is no way to remove it. Fix this by making the destruction of TxGraph::Ref trigger immediate removal of the corresponding transaction in TxGraph, both in main and staging if it exists.

In order to make it possible for higher layers to compare transaction quality (ordering within the implicit total ordering on the mempool), expose a comparison function and test it.

This can be called when the caller has time to spend now, and wants future operations to be fast.

This is a preparation for the next commit, which adds a feature to request the Refs to multiple ancestors/descendants at once.

Like GetAncestors and GetDescendants, but for the union of multiple inputs.

instagibbs · 2025-03-24T14:59:27Z

reACK b2ea365

nice doc improvements

git range-diff master 1601906941fa559ebbee7898453fa77f4606ad38 b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f

sipa · 2025-03-25T19:24:21Z

It occurs to me that it may be possible to get rid of TxGraphImpl::Entry::m_main_lin_index, by storing a transaction linearization position rather than its DepGraphIndex inside the TxGraphImpl::Locator. So instead of using the DepGraphIndex as primary way to identify a transaction within a cluster, make the linearization index the primary, and look up the other one through m_linearization when needed. It's not clear to me if it's better to leave m_mapping be indexed by DepGraphIndex or linearization index, but I think it can be done.

Anyway, after looking into it, I don't think it's worth changing at this stage in the PR, but maybe it can be done as a follow-up.

ajtowns · 2025-03-26T18:25:25Z

reACK b2ea365

ismaelsadeeq

reACK b2ea365 🚀

glozow

ACK b2ea365

Reviewed the structure to convince myself this approach makes sense to support the features required, and reviewed the SimTxGraph + differential fuzzer in closer detail. I haven't read through all the TxGraphImpl functions but did some manual mutation testing.

src/txgraph.cpp

glozow · 2025-03-26T20:34:47Z

src/txgraph.cpp

+    // Update the linearization.
+    m_linearization = std::move(linearization);
+    // Update the Cluster's quality.
+    auto new_quality = optimal ? QualityLevel::OPTIMAL : QualityLevel::ACCEPTABLE;


fwiw, I don't think there is test coverage for this (i.e. fine if this is always set to optimal), though I expect any problems would become more obvious when mempool things are hooked up to txgraph

Right. The difference between OPTIMAL and ACCEPTABLE is not observable as far as the interface is concerned, because it only promises a best effort.

There is an operational difference, though: DoWork() will not bother putting more effort into OPTIMAL clusters.

glozow · 2025-03-26T21:08:55Z

src/test/fuzz/txgraph.cpp

+                    while (true) {
+                        auto old_component = component;
+                        for (auto i : component) {
+                            component |= sel_sim.graph.Ancestors(i);
+                            component |= sel_sim.graph.Descendants(i);
+                        }
+                        if (component == old_component) break;
+                    }


Why can't FindConnectedComponent be used instead?

That finds a connected component, but not the one that this particular transaction is in. I guess it could be refactored to support that too.

Done in #32151.

src/test/fuzz/txgraph.cpp

src/cluster_linearize.h

glozow · 2025-03-26T21:11:44Z

src/test/fuzz/txgraph.cpp

+                top_sim.AddDependency(par, chl);
+                real->AddDependency(*par, *chl);
+                break;
+            } else if (top_sim.removed.size() < 100 && command-- == 0) {


in 05abf33

I suppose this stops us from making removed larger than MAX_TRANSACTIONS + 100?

Where does the MAX_TRANSACTIONS come from? It should limit the size of removed to 100. This was added when the fuzzer ran for much longer (up to 1000 iterations, I think), and I wanted to avoid it just creating and removing transactions. It currently doesn't do much anymore with the lower iteration count.

I was thinking that if all the transactions are descended from this tx, you'd add all of them to to_remove here. So if you started out with 99, you can end up with a lot more than 100 afterward. You wouldn't do more removals afterward, but you'd have more than 100.

Ah, yes, indeed!

sipa mentioned this pull request Nov 24, 2024

Cluster mempool tracking issue #30289

Closed

33 tasks

sipa force-pushed the 202411_txgraph branch from bd3561b to eff6573 Compare November 24, 2024 16:20

DrahtBot added the CI failed label Nov 24, 2024

sipa force-pushed the 202411_txgraph branch 2 times, most recently from 1e432ca to 36a9db4 Compare November 24, 2024 18:46

DrahtBot removed the CI failed label Nov 24, 2024

DrahtBot mentioned this pull request Nov 25, 2024

Cluster mempool implementation #28676

Closed

8 tasks

sipa force-pushed the 202411_txgraph branch 3 times, most recently from 84fcb24 to a93de3b Compare November 25, 2024 22:35

glozow added the Mempool label Nov 28, 2024

sipa force-pushed the 202411_txgraph branch 2 times, most recently from c4b4e1f to 18184f4 Compare December 4, 2024 21:50

sipa mentioned this pull request Dec 8, 2024

cluster mempool: add txgraph diagrams/mining/eviction #31444

Merged

sipa force-pushed the 202411_txgraph branch from 18184f4 to 19cd5a8 Compare December 20, 2024 13:15

sipa force-pushed the 202411_txgraph branch from 19cd5a8 to 72d3ca1 Compare December 22, 2024 17:18

ismaelsadeeq reviewed Dec 24, 2024

View reviewed changes

sipa force-pushed the 202411_txgraph branch from 72d3ca1 to 56684bc Compare January 8, 2025 23:26

DrahtBot mentioned this pull request Jan 9, 2025

cluster mempool: add TxGraph reorg functionality #31553

Merged

ismaelsadeeq reviewed Jan 9, 2025

View reviewed changes

src/txgraph.h Outdated Show resolved Hide resolved

src/txgraph.h Outdated Show resolved Hide resolved

src/cluster_linearize.h Outdated Show resolved Hide resolved

sipa force-pushed the 202411_txgraph branch from 56684bc to dfa4d69 Compare January 9, 2025 21:33

sipa force-pushed the 202411_txgraph branch from dfa4d69 to 0da43c2 Compare January 16, 2025 21:11

sipa added 17 commits March 24, 2025 10:00

txgraph: Add GetChunkFeerate function (feature)

1d27b74

This adds a function to query the chunk feerate of a transaction, by caching it inside the Entry objects.

txgraph: Avoid representative lookup for each dependency (optimization)

1171953

The m_deps_to_add vector is sorted by child Cluster*, which matches the order of an_clusters. This means we can walk through m_deps_to_add while doing the representative lookups for an_clusters, and reuse them.

txgraph: Group per-graph data in ClusterSet (refactor)

34aa3da

This is a preparation for a next commit where a TxGraph will start representing potentially two distinct graphs (a main one, and a staging one with proposed changes).

txgraph: Abstract out ClearLocator (refactor)

c99c730

Move a number of related modifications to TxGraphImpl into a separate function for removal of transactions. This is preparation for a later commit where this will be useful in more than one place.

txgraph: Cache oversizedness of graphs (optimization)

6b037ce

txgraph: Allow Refs to outlive the TxGraph (feature)

22c68cd

txgraph: Expose ability to compare transactions (feature)

295a1ca

In order to make it possible for higher layers to compare transaction quality (ordering within the implicit total ordering on the mempool), expose a comparison function and test it.

txgraph: Add DoWork function (feature)

b685d32

This can be called when the caller has time to spend now, and wants future operations to be fast.

txgraph: Add CountDistinctClusters function (feature)

aded047

txgraph: Multiple inputs to Get{Ancestors,Descendant}Refs (preparation)

54bcedd

This is a preparation for the next commit, which adds a feature to request the Refs to multiple ancestors/descendants at once.

txgraph: Add Get{Ancestors,Descendants}Union functions (feature)

b2ea365

Like GetAncestors and GetDescendants, but for the union of multiple inputs.

sipa force-pushed the 202411_txgraph branch from 41b4434 to b2ea365 Compare March 24, 2025 14:06

DrahtBot requested a review from ismaelsadeeq March 24, 2025 14:59

ismaelsadeeq reviewed Mar 26, 2025

View reviewed changes

glozow reviewed Mar 26, 2025

View reviewed changes

sipa mentioned this pull request Mar 27, 2025

Follow-ups for txgraph #31363 #32151

Merged

vnprc mentioned this pull request Mar 27, 2025

Socratic Seminar 39 TriangleBitDevs/TriangleBitDevs.github.io#36

Closed

ismaelsadeeq mentioned this pull request Mar 31, 2025

RPC: getblockstats might not return the *effective* percentile fee rate #31140

Open

instagibbs mentioned this pull request Apr 1, 2025

TxGraph: Increase fuzz coverage #32177

Merged

kevkevinpal mentioned this pull request Apr 9, 2025

[April 10th Bitdevs] Suggested topics ChicagoBitdevs/ChicagoBitdevs.github.io#126

Closed

Conversation

sipa commented Nov 24, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

1. Overview

2. Interface

3. Implementation

4. Future work

Uh oh!

DrahtBot commented Nov 24, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Coverage & Benchmarks

Reviews

Conflicts

Uh oh!

DrahtBot commented Nov 24, 2024

Uh oh!

jonatack commented Dec 5, 2024

Uh oh!

sipa commented Dec 20, 2024

Uh oh!

sipa commented Dec 22, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ismaelsadeeq left a comment

Choose a reason for hiding this comment

Uh oh!

sipa commented Dec 24, 2024

Uh oh!

ismaelsadeeq commented Dec 24, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sipa commented Jan 9, 2025

Uh oh!

instagibbs commented Mar 24, 2025

Uh oh!

sipa commented Mar 25, 2025

Uh oh!

ajtowns commented Mar 26, 2025

Uh oh!

ismaelsadeeq left a comment

Choose a reason for hiding this comment

Uh oh!

glozow left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

10 participants

sipa commented Nov 24, 2024 •

edited

Loading

DrahtBot commented Nov 24, 2024 •

edited

Loading

sipa commented Dec 22, 2024 •

edited

Loading