Skip to content

p2p: Erlay support signaling#23443

Merged
glozow merged 9 commits intobitcoin:masterfrom
naumenkogs:2021-11-erlay1
Oct 17, 2022
Merged

p2p: Erlay support signaling#23443
glozow merged 9 commits intobitcoin:masterfrom
naumenkogs:2021-11-erlay1

Conversation

@naumenkogs
Copy link
Contributor

@naumenkogs naumenkogs commented Nov 5, 2021

This is a part of the Erlay project:


This PR adds a new p2p message sendtxrcncl signaling for reconciliation support.
Before sending that message, a node is supposed to "pre-register" the peer by generating and storing an associated reconciliation salt component.
Once the salts are exchanged within this new message, nodes "register" each other for future reconciliations by computing and storing the aggregate salt, along with the reconciliation parameters based on the connection direction.

@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 5, 2021

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #26151 (refactor: Guard TxRequestTracker by its own lock instead of cs_main by dergoegge)
  • #25957 (wallet: fast rescan with BIP157 block filters for descriptor wallets by theStack)
  • #25572 (refactor: Introduce EvictionManager and use it for the inbound eviction logic by dergoegge)
  • #24545 (BIP324: Enable v2 P2P encrypted transport by dhruv)
  • #23233 (BIP324: Add encrypted p2p transport {de}serializer by dhruv)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Coverage

Coverage Change (pull 23443, 4dc7bac65e41a1f2bbbae084b8bbaeee70cec700) Reference (master, ed4eeaf)
Lines +0.0276 % 83.7655 %
Functions +0.0294 % 80.7326 %
Branches +0.0133 % 51.3151 %

Updated at: 2022-05-10T14:16:22.523123.

Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed until 85992f3. So far only minor comments.

Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished a first parse, including verifying unit test coverage of the new TxReconciliationTracker class. LGTM

@rebroad
Copy link
Contributor

rebroad commented Nov 25, 2021

txreconciliation.cpp:43:20: warning: private field 'm_k0' is not used [-Wunused-private-field]
    const uint64_t m_k0, m_k1;
                   ^
txreconciliation.cpp:43:26: warning: private field 'm_k1' is not used [-Wunused-private-field]
    const uint64_t m_k0, m_k1;

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some nits

@maflcko
Copy link
Member

maflcko commented Nov 29, 2021

Run process_message with args ['/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz', '-runs=1', '/tmp/cirrus-ci-build/ci/scratch/qa-assets/fuzz_seed_corpus/process_message']fuzz: test/fuzz/process_message.cpp:57: auto initialize_process_message()::(anonymous class)::operator()() const: Assertion "GetNumMsgTypes() == getAllNetMessageTypes().size()" && check' failed.

@naumenkogs naumenkogs force-pushed the 2021-11-erlay1 branch 2 times, most recently from cdaabda to ea3b87a Compare December 2, 2021 11:38
Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you think about a -disablerecon command or something similar, to not activate Erlay until all the parts have been merged?
Providing the opportunity to opt in/out would probably make sense anyway, but it would also make the review process of its parts independent from releases - e.g. I also don't think it makes sense to have nodes running master exchange SENDRECON before Erlay has been fully merged.

@naumenkogs
Copy link
Contributor Author

Did you think about a -disablerecon command or something similar, to not activate Erlay until all the parts have been merged?

Yeah I was also thinking about how it's weird to send SENDRECON messages while they don't mean anything, but couldn't come up with any good solution.
For starters, I could comment out sending the message at least I guess.

I guess a flag is better if someone wants to test this stuff via functional tests (without modifying the source code), so yeah, that's probably an ultimate solution. I'll add a commit for that.

Copy link
Member

@glozow glozow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, thanks for splitting it up. I did a first pass, mostly comparing to the BIP330 specification, looking at test coverage, and a little bit of implementation.

@GeneFerneau
Copy link

utACK ea3b87a

Code review, everything looks good to me outside nits raised by others.

Will run the fuzzer for some cycles, and report back results.

Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach ACK, code review until 103e52d - didn't look at tests yet.

This PR introduces the logic to register newly connected peers as peers that support transaction reconciliation, and initialize the necessary data structures required to do the reconciliation later on.

I think everything is well laid out and implemented, so most of my comments are quite nitty and focus on readability etc. They're subjective, feel free to ignore.

Some general thoughts/suggestions:

  • "SENDRECON" does not hint that it corresponds to transactions, would it make sense to rename this to "SENDTXRECON" so people can quickly understand what it's about without know the Erlay code?
  • I understand you're renaming the two roles to "initiator" & "responder", there are still quite a few instances of "requestor", "sender", "receiver" throughout the code and in BIP300. Would suggest to replace them all in both documentation and variable names, consistent naming makes it much easier to follow.

Copy link
Contributor

@jnewbery jnewbery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you think about a -disablerecon command or something similar, to not activate Erlay until all the parts have been merged?

I agree that this should be added, although to bikeshed I'd call it -erlay and default to false.

@naumenkogs naumenkogs force-pushed the 2021-11-erlay1 branch 2 times, most recently from 5136f50 to 07b9357 Compare December 10, 2021 14:14
@naumenkogs
Copy link
Contributor Author

Still going through comments, beware it's not the final shape from my side.

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK e91690e

Some non showstoppers below.

argsman.AddArg("-seednode=<ip>", "Connect to a node to retrieve peer addresses, and disconnect. This option can be specified multiple times to connect to multiple nodes.", ArgsManager::ALLOW_ANY, OptionsCategory::CONNECTION);
argsman.AddArg("-networkactive", "Enable all P2P network activity (default: 1). Can be changed by the setnetworkactive RPC command", ArgsManager::ALLOW_ANY, OptionsCategory::CONNECTION);
argsman.AddArg("-timeout=<n>", strprintf("Specify socket connection timeout in milliseconds. If an initial attempt to connect is unsuccessful after this amount of time, drop it (minimum: 1, default: %d)", DEFAULT_CONNECT_TIMEOUT), ArgsManager::ALLOW_ANY, OptionsCategory::CONNECTION);
argsman.AddArg("-txreconciliation", strprintf("Enable transaction reconciliations per BIP 330 (default: %d)", DEFAULT_TXRECONCILIATION_ENABLE), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::CONNECTION);
Copy link
Contributor

@vasild vasild Oct 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "per BIP 330" looks too opaque for this help text, it is too long and technical to read the BIP itself in order to understand what this option is about.

Suggested change
argsman.AddArg("-txreconciliation", strprintf("Enable transaction reconciliations per BIP 330 (default: %d)", DEFAULT_TXRECONCILIATION_ENABLE), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::CONNECTION);
argsman.AddArg("-txreconciliation", strprintf("Enable transaction reconciliations for more efficient transaction propagation (default: %d)", DEFAULT_TXRECONCILIATION_ENABLE), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::CONNECTION);

There are other BIP xyz mentions in the help texts, feel free to ignore.


// We do this exactly once per peer (which are unique by NodeId, see GetNewNodeId) so it's
// safe to assume we don't have this record yet.
Assert(m_states.emplace(peer_id, local_salt).second);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this assert() from above:

assert(m_states.find(peer_id) == m_states.end());

They both check for the same thing. That way we will do one lookup instead of two, maybe insignificant wrt performance but will be less code.

PS why use assert() in one place and Assert() in another? Did you intend to use Assume()?

class TxReconciliationTracker::Impl
{
private:
mutable Mutex m_txreconciliation_mutex;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "txreconciliation" is already in the name of the class TxReconciliationTracker. This can be just m_mutex.

For example, another member is called m_states, not m_txreconciliation_states.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the initial version, but then @jonatack asked for the more verbose name, and I didn't care much... Maybe he has to say something.

Copy link
Member

@jonatack jonatack Oct 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rationale is that m_txreconciliation_mutex returns more useful results than m_mutex when git grepping the codebase, and the more specific mutex naming seems to be a convention in the codebase as much as the generic naming.

return;
}

if (!peer->GetTxRelay()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the code and the comments disagree:

if (!peer->GetTxRelay()) {
    // Disconnect peers that send a SENDTXRCNCL message even though we indicated we don't
    // support transaction relay.

We set txrelay (thus GetTxRelay() will be true) on this condition:

!pfrom.IsBlockOnlyConn() && (fRelay || (peer->m_our_services & NODE_BLOOM)) // fRelay comes from the peer

but we indicate transaction relay support on this condition:

!m_ignore_incoming_txs && !pnode.IsBlockOnlyConn() && !pnode.IsFeelerConn();

They are not the same.

Also noted by @jonatack here.

if (recon_state == m_states.end()) return NOT_FOUND;
uint64_t* local_salt = std::get_if<uint64_t>(&recon_state->second);
// A peer is already registered. This should be checked by the caller.
Assume(local_salt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be assert() since we dereference local_salt below. It will crash anyway if it is nullptr. Better crash with a meaningful message produced by the assert().


BOOST_AUTO_TEST_CASE(RegisterPeerTest)
{
TxReconciliationTracker tracker(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use TXRECONCILIATION_VERSION instead of 1?

tracker.PreRegisterPeer(0);

// Both roles are false, don't register.
BOOST_CHECK(tracker.RegisterPeer(/*peer_id=*/0, /*is_peer_inbound=*/true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and elsewhere, BOOST_CHECK_EQUAL(x, y) is preferred over BOOST_CHECK(x == y) because the former gives a better message should the check fail.

@vasild
Copy link
Contributor

vasild commented Oct 10, 2022

In the OP:

  • the link to BIP 330 can be updated and this can be removed: "Note this is not what's in the bitcoin/bips repo, but an updated version".
  • s/sendrecon/sendtxrcncl

If we're connecting to the peer which might support
transaction reconciliation, we announce we want to reconcile
with them.

We store the reconciliation salt so that when the peer
responds with their salt, we are able to compute the
full reconciliation salt.

This behavior is enabled with a CLI flag.
Once we received a reconciliation announcement support
message from a peer and it doesn't violate our protocol,
we store the negotiated parameters which will be used
for future reconciliations.
We optimistically pre-register a peer for txreconciliations
upon sending txreconciliation support announcement.
But if, at VERACK, we realize that the peer never sent
WTXIDRELAY message, we should unregister the peer
from txreconciliations, because txreconciliations rely on wtxids.
@glozow
Copy link
Member

glozow commented Oct 17, 2022

Looks like rebased for #25667.
Pinging @sipa @dergoegge @vasild @ariard @mzumsande for re-ACK

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK e56d1d2

@vasild
Copy link
Contributor

vasild commented Oct 17, 2022

The newly added test p2p_sendtxrcncl.py failed just on "Win64 native" 😭

@dergoegge
Copy link
Member

re-ACK e56d1d2

@sipa
Copy link
Member

sipa commented Oct 17, 2022

re-ACK e56d1d2. No differences with a rebase of previously reviewed e91690e.

@glozow
Copy link
Member

glozow commented Oct 17, 2022

The newly added test p2p_sendtxrcncl.py failed just on "Win64 native" sob

Job: https://cirrus-ci.com/task/6088863331385344
Reran and it turned green. Maybe coincidental timeout?

@maflcko
Copy link
Member

maflcko commented Oct 17, 2022

The reason was likely

node0 2022-10-17T10:37:19.451252Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\net_processing.cpp:3460] [ProcessMessage] [net] wtxidrelay received after verack from peer=8; disconnecting

https://cirrus-ci.com/task/6088863331385344?logs=functional_tests#L605

However, I can't see how this could possibly happen in the framework. (peer8 is a python node)

@mzumsande
Copy link
Contributor

re-ACK e56d1d2

@naumenkogs
Copy link
Contributor Author

Thank you all for making this happen! I will create a PR for fixups, and finish the second PR shortly.

@ariard
Copy link

ariard commented Oct 17, 2022

Post-Merge ACK e91690e

// - we intended to exchange transactions over this connection while establishing it
// and the peer indicated support for transaction relay in the VERSION message;
// - we are not in -blocksonly mode.
if (pfrom.m_relays_txs && !m_ignore_incoming_txs) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to send SENDTXRCNCL to feeler connections?

argsman.AddArg("-seednode=<ip>", "Connect to a node to retrieve peer addresses, and disconnect. This option can be specified multiple times to connect to multiple nodes.", ArgsManager::ALLOW_ANY, OptionsCategory::CONNECTION);
argsman.AddArg("-networkactive", "Enable all P2P network activity (default: 1). Can be changed by the setnetworkactive RPC command", ArgsManager::ALLOW_ANY, OptionsCategory::CONNECTION);
argsman.AddArg("-timeout=<n>", strprintf("Specify socket connection timeout in milliseconds. If an initial attempt to connect is unsuccessful after this amount of time, drop it (minimum: 1, default: %d)", DEFAULT_CONNECT_TIMEOUT), ArgsManager::ALLOW_ANY, OptionsCategory::CONNECTION);
argsman.AddArg("-txreconciliation", strprintf("Enable transaction reconciliations per BIP 330 (default: %d)", DEFAULT_TXRECONCILIATION_ENABLE), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::CONNECTION);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably unrelated, but it might be good to split the connection section into net and net_processing options. (This one would go into net_processing)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.