net: Add new permission `forceinbound` to evict a random unprotected connection if all slots are otherwise full by pinheadmz · Pull Request #27600 · bitcoin/bitcoin

pinheadmz · 2023-05-08T19:59:30Z

Use case: I run a full node that accepts inbound connections and have a whitebind setting so my personal light client can always connect, even when maxconnections (and particularly all inbound slots) is already full.

Currently when connections are full, if we receive an inbound peer request, we look for a current connection to evict so the new peer can have a slot. To find an evict-able peer we go through all our peers and "protect" multiple categories of peers, then we evict the "worst" peer that is left unprotected. If there are no peers left to evict, the inbound connection is denied.

With this PR, if the inbound connection has forceinbound permission we start the eviction process by first protecting all noban and outbound connections, then selecting one of the remaining peers (if any) at random. Then we loop through all our current connections, removing protected peers from the evict-able list. If we end up protecting all our remaining connections, the randomly chosen peer is evicted.

forceinbound implies noban permission.

All outbound and noban connections remain protected from eviction.
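
For readers following along, here is a minimal sketch of the flow described above. It is not the PR's code: `PeerSketch`, `SelectPeerToEvictSketch`, and the single `protected_later` flag are hypothetical stand-ins, and the final "worst peer" choice is reduced to taking the first survivor.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <random>
#include <vector>

// Hypothetical, simplified peer record (not Bitcoin Core's NodeEvictionCandidate).
struct PeerSketch {
    int64_t id;
    bool inbound;
    bool noban;
    bool protected_later; // stands in for all of the usual "protect" passes
};

// Returns the id of the peer to evict, or std::nullopt if nobody can be evicted.
std::optional<int64_t> SelectPeerToEvictSketch(std::vector<PeerSketch> peers, bool force, std::mt19937& rng)
{
    // Outbound and noban peers are never eviction candidates.
    peers.erase(std::remove_if(peers.begin(), peers.end(),
                               [](const PeerSketch& p) { return !p.inbound || p.noban; }),
                peers.end());
    if (peers.empty()) return std::nullopt;

    // When forced, remember one random remaining peer as a fallback.
    std::optional<int64_t> force_evict;
    if (force) {
        std::uniform_int_distribution<std::size_t> pick(0, peers.size() - 1);
        force_evict = peers[pick(rng)].id;
    }

    // The usual protection passes, collapsed into one flag for this sketch.
    peers.erase(std::remove_if(peers.begin(), peers.end(),
                               [](const PeerSketch& p) { return p.protected_later; }),
                peers.end());

    // Everyone got protected: use the random pick (still nullopt when not forced).
    if (peers.empty()) return force_evict;

    // Placeholder for the real "worst peer" selection.
    return peers.front().id;
}
```

Note that the random pick may or may not end up protected by the later passes; it is only used when the normal algorithm finds nobody to evict.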

DrahtBot · 2023-05-08T19:59:33Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	mzumsande
Stale ACK	LarryRuane, stickies-v, willcl-ark

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#28463 (p2p: Increase inbound capacity for block-relay only connections by mzumsande)
#27581 (net: Continuous ASMap health check by fjahr)
#27114 (p2p: Allow whitelisting outgoing connections by brunoerg)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

stickies-v

Concept ACK, makes sense to prioritize whitelisted peers. Will review more in-depth soon.

src/node/eviction.cpp

src/net.cpp

src/node/eviction.cpp

src/net.h

mzumsande

Concept ACK

As I suggested in the review club, an alternative, simpler approach would be to just pick a random peer after removing NoBan and outbound connections when in force-mode. Then, if at the end of the usual eviction algorithm, we don't have an evict-able peer, we would evict the random one instead.
This would make the code simpler (no need to change EraseLastKElements or keep track of last), and I don't really see a major downside:

The EraseLastKElements eviction criteria don't really seem to be sorted by priority anyway right now (because order doesn't matter so far), so evicting a peer that would've been protected by later calls doesn't seem better than evicting one that would've been protected by earlier ones.
Whitebind No-Ban peers are more trusted to begin with, but if they maliciously wanted to take over all inbound slots by repeatedly connecting to us on the whitebind address, they can do that anyway, whether the eviction is random or whether the current approach of the PR is used. So it's not clear to me what the extra benefit of the current approach over random eviction is.

pinheadmz · 2023-05-17T20:18:58Z

@stickies-v @mzumsande thanks for your review and feedback, I refactored the PR so we evict a random peer when forced if all other peers are protected. It is MUCH simpler ;-)

stickies-v

Light approach ACK c826187b5070ce89edcde0536183714a1c2e3207

I think this random selection approach suggested by mzumsande is a more elegant approach. "Light" in approach ACK because I need to build more confidence that this doesn't introduce new attack vectors.

src/node/eviction.cpp

src/node/eviction.h

src/net.h

pinheadmz · 2023-05-18T14:19:18Z

Thanks @stickies-v nits addressed! 🙏

stickies-v

Code review ACK 96b513f605fb2df441b66da583056b1c8acd4dbc, but I think the first commit should be removed from this PR, I think it's an accident?

I think the code is good and suits the intent of the PR well. My only concern is about the potential footgun introduced, even if it is relatively mild. Anyone running bitcoind with whitebind=0.0.0.0:<port> will now be vulnerable to having all their inbound non-whitelisted slots taken over by an attacker that has figured out what the whitelisted port is. It doesn't seem like a huge concern, given that:

it only affects inbound connections, which we already assume to be more susceptible to these kinds of attacks, and
it does require the user to manually whitebind to 0.0.0.0

but I just wanted to flag it here anyway in case it is actually more serious than I see it to be.

Also, this seems like behaviour change that would benefit from a mention in the release notes?

stickies-v · 2023-05-22T14:54:27Z

src/node/eviction.cpp

nit: if vEvictionCandidates is empty, there's no point (I think, couldn't see any side effects in EraseLastKElements or ProtectEvictionCandidatesByRatio) executing the next lines so might as well return early? Unless you think this makes the code less clear?

Suggested change

-    std::optional<NodeId> force_evict;
-    if (vEvictionCandidates.size() > 0 && force) {
+    if (vEvictionCandidates.empty()) return std::nullopt;
+    std::optional<NodeId> force_evict;
+    if (force) {

Good point, although "if empty, return now" logic would make sense after each step in this function (why bother calling EraseLastKElements() with an empty array?) even without this PR. One thing we could do is add that logic to the beginning of EraseLastKElements() itself, but I dunno how much time is really wasted in there (sort, min, erase ... ?)

I had similar considerations to e.g. wrap the EraseLastKElements call in a lambda fn, but it's probably not worth the LoC change.

I think this case is slightly different in that it better highlights that if there are no inbound NoBan peers, the result is always std::nullopt.

good point, i'll add it. one line i think in this case will improve readability
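
To illustrate the other option floated in this thread (putting the "if empty, return now" check inside `EraseLastKElements()` itself), here is a rough sketch. `CandidateSketch` and `EraseLastKElementsSketch` are hypothetical names, and the erase step is an assumption about the helper's body, since only its signature and the `std::sort` call are quoted in this PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical minimal candidate type for this sketch.
struct CandidateSketch {
    int id;
    long connected_seconds;
};

// Sort by the comparator, then erase those of the last k elements for which predicate is true.
void EraseLastKElementsSketch(
    std::vector<CandidateSketch>& elements,
    std::function<bool(const CandidateSketch&, const CandidateSketch&)> comparator,
    std::size_t k,
    std::function<bool(const CandidateSketch&)> predicate = [](const CandidateSketch&) { return true; })
{
    if (elements.empty()) return; // guard under discussion: nothing to sort or erase
    std::sort(elements.begin(), elements.end(), comparator);
    const std::size_t erase_size = std::min(k, elements.size());
    elements.erase(std::remove_if(elements.end() - erase_size, elements.end(), predicate), elements.end());
}
```

The guard only skips a sort and an erase over an empty vector, so it is a readability choice rather than a meaningful speedup.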

stickies-v · 2023-05-22T18:09:35Z

test/functional/p2p_eviction.py

As you mentioned in the review club, whitelisting just based on ports enables an attacker that discovered which port(s) are whitelisted to take over all (non-whitelisted) inbound connections slots.

Therefore, I think we have to be careful not to accidentally lead people to this footgun. Binding to 127.0.0.1 seems much more prudent to not set a bad example.

Suggested change

-self.restart_node(0, extra_args=['-maxconnections=12', '-whitebind=0.0.0.0:30201'])
+self.restart_node(0, extra_args=['-maxconnections=12', '-whitebind=127.0.0.1:30201'])

updated test and also included warning in release note

stickies-v · 2023-05-24T21:52:08Z

doc/release-notes-27600.md

I think "on local ports" is unnecessarily restrictive/scary. I think it's still completely fine to whitelist remote addresses, you mostly just want to avoid ranges, right?

@mzumsande what's your opinion on this? Is this PR a big enough change to warrant this kind of warning?

Not really sure, I've never looked into these options much and don't know about best practices - I think that if you use -whitebind with non-local addresses, you'd at least have to make sure that that address is not self-advertised. I guess that this applies more to -whitebind than -whitelist, so that the preferred approach in case of a non-local connection would be to use -whitelist?
fyi @vasild, do you have an opinion on this?

That's a good point, whitelist is harder to attack than whitebind because the attacker would have to spoof their origin IP repeatedly to fill up your inbounds. If a user whitebinds a port and an attacker figures out that port number, they can trivially evict all your other inbounds.

I agree with what you say above - whitebind on publicly accessible address:port with this new permission sounds bad. Similar with whitelist - if a range is used.

This PR currently expands the semantic of the noban permission. This will affect existing setups that already use it. Would it make sense to introduce a new permission, separate from noban? I mean - now if somebody is running -whitebind=noban@publicaddr:port then a bad actor could cause harm on master, but even more harm with this PR.

interesting idea, so we could add NetPermissionFlags::NoBanForce or something. The danger will still be present for users of the permission, but existing users of NoBan wouldn't have to worry. And since NoBan is a default of whitebind it'll require more user attention anyway to use the more dangerous option.

In that case, the release note should still warn the user about using NoBanForce (or whatever we call it) but that warning can just be general, like, keep an eye on your netinfo if you do this. As opposed to just recommending only setting local addresses with whitebind

@vasild

This PR currently expands the semantic of the noban permission

Could you please elaborate on this point? I thought that with this PR, noban nodes are still completely protected (they are removed from the eviction candidates list before the random node is chosen). But it's certain that I'm missing something. Thanks.

@LarryRuane you are correct but what we are changing (in the current state of this branch) is that noban nodes can now force disconnection of other peers. Since many users may already have noban set, maybe even for a large range of IPs, this branch would introduce a new vulnerability they may not be aware of. Because of that I think it does make more sense to specify a new permission so users can narrow the attack surface
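
A small sketch of the "separate permission" idea, assuming bit-flag permissions. The enum name and bit values are made up for illustration and are not Bitcoin Core's actual `NetPermissionFlags` layout. The point is only that the new flag can include the noban bit (so forceinbound implies noban) while existing noban users do not gain the force behaviour.

```cpp
#include <cstdint>

// Hypothetical flag layout for illustration only.
enum class PermissionSketch : uint32_t {
    None = 0,
    NoBan = 1U << 0,
    // Carries the NoBan bit, so ForceInbound implies NoBan; the reverse does not hold.
    ForceInbound = (1U << 1) | NoBan,
};

// True when every bit of `wanted` is set in `flags`.
constexpr bool HasPermissionSketch(PermissionSketch flags, PermissionSketch wanted)
{
    return (static_cast<uint32_t>(flags) & static_cast<uint32_t>(wanted)) == static_cast<uint32_t>(wanted);
}

static_assert(HasPermissionSketch(PermissionSketch::ForceInbound, PermissionSketch::NoBan), "force implies noban");
static_assert(!HasPermissionSketch(PermissionSketch::NoBan, PermissionSketch::ForceInbound), "noban does not imply force");
```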

LarryRuane · 2023-06-10T16:52:56Z

Please consider editing the description slightly, if my understanding is correct.

With this PR, if the inbound connection is on our whitelist we start the eviction process by selecting a random unprotected peer.

At this point in the process, where we choose a random peer, we're considering all (inbound, not noban) peers, is that correct? They are all (other than outbound or noban) "unprotected" because we haven't protected any of them yet. So maybe this should say, "... selecting a random peer" or "... selecting a random inbound peer" -- because we may randomly choose a peer that ends up being protected, or not protected. That would make this PR easier for me to understand.

I do think the updated algorithm is very elegant (great suggestion, @mzumsande). It chooses a random peer, which may end up being erased from the eviction candidate list, or may not, but we don't care either way; we evict it if we need to, all the better if it actually did not get protected (but, again, we don't care).

By the way, I'm going to highlight this PR in the Optech newsletter next week, that's why I'm particularly interested in making sure I understand it.

LarryRuane

utACK fa78fc543cd46ee1c7181ecf6b696c2892b9f8d3
except for possible attack vectors (for example, NoBanForce) that I'm not really qualified to comment on. I may try to contribute to that discussion after I've learned more. Nice PR!

src/node/eviction.cpp

LarryRuane · 2023-06-10T17:47:19Z

doc/release-notes-27600.md

@vasild

This PR currently expands the semantic of the noban permission

Could you please elaborate on this point? I thought that with this PR, noban nodes are still completely protected (they are removed from the eviction candidates list before the random node is chosen). But it's certain that I'm missing something. Thanks.

LarryRuane · 2023-06-10T17:53:44Z

src/node/eviction.cpp

Perhaps insert a comment before this line // This may still return std::nullopt

I'd suggest expanding this comment to explain what exactly happened.

LarryRuane · 2023-06-10T18:17:02Z

src/node/eviction.h

I'm not a fan of default arguments; consider making this not have a default. It makes sense if there are many existing calls to a function and you don't want to touch them all, but there's only one call to this function in production code. You would have to touch about 6 call sites in test code, but that's not too bad. The reason I'm not a fan of default arguments is that they hide something that may be helpful to see when just looking at the call site. (You might think, "Oh, it's possible that eviction can be 'forced', what does that mean?") But this is an optional suggestion, fine if you leave it as-is, just something to consider.

ok I like this, will do

This still remains default, no?

LarryRuane · 2023-06-10T20:00:02Z

Even though not needed for this PR, you may want to consider adding a commit to remove the unnecessary templating. It's perfectly reasonable not to make this change in this PR, but I thought I'd document it just in case; I do think it makes the code easier to understand.

--- a/src/node/eviction.cpp
+++ b/src/node/eviction.cpp
@@ -74,9 +74,10 @@ struct CompareNodeNetworkTime {
 };
 
 //! Sort an array by the specified comparator, then erase the last K elements where predicate is true.
-template <typename T, typename Comparator>
 static void EraseLastKElements(
-    std::vector<T>& elements, Comparator comparator, size_t k,
+    std::vector<NodeEvictionCandidate>& elements,
+    std::function<bool(const NodeEvictionCandidate&, const NodeEvictionCandidate&)> comparator,
+    size_t k,
     std::function<bool(const NodeEvictionCandidate&)> predicate = [](const NodeEvictionCandidate& n) { return true; })
 {
     std::sort(elements.begin(), elements.end(), comparator);

pinheadmz · 2023-06-10T20:57:57Z

@LarryRuane thanks for reviewing. I actually had removed the templating in the very first version of this branch but after changing the approach I just left it alone. The function is written like a generic utility function, even though we don't currently use it for any other vectors. So I think the function could go as written into a util.cpp or something, in a future PR?

pinheadmz · 2023-09-26T17:36:12Z

@willcl-ark thanks for reviewing, addressed your feedback except for the new test but always open to writing more tests so lemme know if you think something is uncovered there.

pinheadmz · 2023-09-27T20:07:41Z

Rebasing on master hopefully to fix Windows CI failure

pinheadmz · 2023-10-16T14:33:53Z

Rebased on master again, thanks @willcl-ark for your review. Hoping to get some more feedback from @mzumsande and @naumenkogs to proceed, or abandon

mzumsande

Hoping to get some more feedback from @mzumsande and @naumenkogs to proceed, or abandon

Personally, I'm not sure if the use case (running a node with -maxconnections < 40 and at the same time wanting certain inbound peers, or wanting a really high number of whitelisted inbound peers at the same time) is common enough to introduce an extra permission just for that. In more typical cases, noban should be sufficient.
But if others think it is I'm not against it.

A related use case I could see (but I'm also not sure if there is demand for that) is to only be reachable by whitelisted "friends" but no one else. That doesn't seem to be possible right now, -fListen is either all or none.

naumenkogs · 2023-11-08T09:54:55Z

@pinheadmz Honestly I still struggle to understand what are the practical scenarios of using this. E.g., what could lead to not being able to evict a connection?

pinheadmz · 2023-11-08T16:30:32Z

@naumenkogs If the user has also set a low maxconnections value then SelectNodeToEvict() might return null.

Quick math: 4+8+4+8+4 = 28 protected nodes. Plus 8 full outbound and 2 block-only = 38. So if a user runs a full node on limited hardware (like my Raspberry Pi at home) and has something like -maxconnections=30, then even with the existing whitebind permissions, their privileged node may fail to connect.

That being said, it ALSO means that the solution to #8798 may be to just ensure maxconnections is set > 40. It's been 7 years since that issue was opened and I can check with my own (modern) hardware that 40 connections is sane.
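
The quick math above, written out as a sketch. The figures come straight from the comment; the constant and function names are illustrative, and the check is only a rough bound that ignores other connection types.

```cpp
// Figures from the comment above; names are hypothetical.
constexpr int kProtectedInbound = 4 + 8 + 4 + 8 + 4; // inbound peers the eviction logic may protect
constexpr int kFullOutbound = 8;
constexpr int kBlockOnlyOutbound = 2;
constexpr int kPotentiallyUnevictable = kProtectedInbound + kFullOutbound + kBlockOnlyOutbound;

static_assert(kProtectedInbound == 28, "protected inbound total");
static_assert(kPotentiallyUnevictable == 38, "protected inbound plus outbound");

// Rough check: at or below this many total slots, every existing peer could be
// exempt from eviction, so a new inbound may find nobody to evict.
constexpr bool CanRunOutOfCandidates(int max_connections)
{
    return max_connections <= kPotentiallyUnevictable;
}
```

With `-maxconnections=30` the check is true, which matches the Raspberry Pi scenario; with 40 or more it is false.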

naumenkogs · 2023-11-09T07:40:26Z

src/node/eviction.cpp

c5e7e6563201965e88e07aa8da8f26e17fc1b4db

"unprotected" here is sloppy. I'd just drop the comment.

naumenkogs · 2023-11-09T07:42:39Z

src/node/eviction.cpp

I'd suggest expanding this comment to explain what exactly happened.

naumenkogs · 2023-11-09T07:43:41Z

src/node/eviction.h

This still remains default, no?

naumenkogs · 2023-11-09T07:44:42Z

src/node/eviction.h

c5e7e6563201965e88e07aa8da8f26e17fc1b4db

"among inbound no-noban connections" or something (i hate no-noban but not sure what's better)

done, thanks

naumenkogs · 2023-11-09T07:53:36Z

src/net.cpp

311902f2cf94ee0e11ed34a1373db42dbc218b20

No need to use legacy naming patterns for new variables :)

hmmm I'm gonna keep it actually because later in this function is a boolean called forced so, I think this is legible

naumenkogs · 2023-11-09T07:57:11Z

src/net.cpp

311902f2cf94ee0e11ed34a1373db42dbc218b20

should this be exposed in peer info?

good idea, added a new commit for this

pinheadmz · 2023-11-09T16:55:24Z

Thanks for the review Gleb, I also rebased on master to fix conflicts

naumenkogs

Not sure if you missed the two comments:
one
two

naumenkogs · 2023-11-10T07:45:33Z

src/net.h

    TransportProtocolType m_transport_type;
    /** BIP324 session id string in hex, if any. */
    std::string m_session_id;
+    /** whether this peer forced its connection by evicting another */


8c20268

So forced actually means two things to you:

Before connecting — something that might evict another peer

After connecting — something that has evicted another peer

I think this is confusing and better to use different words. RPC can expose "force_evicted" flag (to cover the latter case and apply to the Connection), while "forceInbound" can stay in the network/potential-connection context (the former case).

Apparently, in the previous commit 7586802 you assign forced = true; even for non-forceInbound peers.

So basically if a node had 8 connections which are non-forceInbound but still did eviction, a real forceInbound connection won't be able to join (see how you count nForced). Is that intended? Seems like a bug to me.
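
To make the concern concrete, here is a sketch of the two ways of counting, as I read the comment. The commit's actual code is not quoted in this thread, so `ConnSketch` and both counting functions are hypothetical; only the limit of 8 and the `MAX_FORCED_INBOUND_CONNECTIONS` name come from the PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr std::size_t MAX_FORCED_INBOUND_CONNECTIONS = 8;

// Hypothetical per-connection state separating the two meanings of "forced".
struct ConnSketch {
    bool force_inbound; // has the permission: might evict another peer when connecting
    bool force_evicted; // did evict another peer when connecting
};

// Counts every connection that evicted someone, with or without the permission.
std::size_t CountAnyEvictors(const std::vector<ConnSketch>& conns)
{
    return static_cast<std::size_t>(std::count_if(
        conns.begin(), conns.end(), [](const ConnSketch& c) { return c.force_evicted; }));
}

// Counts only ForceInbound connections that evicted someone.
std::size_t CountForceInboundEvictors(const std::vector<ConnSketch>& conns)
{
    return static_cast<std::size_t>(std::count_if(
        conns.begin(), conns.end(), [](const ConnSketch& c) { return c.force_inbound && c.force_evicted; }));
}
```

With 8 ordinary inbound peers that each evicted someone, the first count already reaches the limit and would block a real ForceInbound peer, while the second count is still zero.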

DrahtBot · 2023-12-06T18:20:10Z

🐙 This pull request conflicts with the target branch and needs rebase.

sr-gi · 2024-01-04T20:55:09Z

I feel like this makes sense conceptually, but I have similar concerns to @stickies-v @mzumsande and @naumenkogs with it.

If we could approach this without the need to introduce an additional permission I'd be happier, since it seems a big change for a narrow use case with a potential workaround by just accepting more than 38 total connections (plus N for the nobans). I also wonder how this plays out with #27114, given this would be an in only permission.

Given this only triggers under really specific conditions, can we not just prioritize our whitelisted peer under those conditions? The addition of new whitelisted peers can take priority over the ones we protect for "diversity criteria" as long as we do not have enough candidates to evict a peer, that is: after protecting the current noban and outbound peers, if the number of remaining peers is smaller than the number of peers we will protect for diversity criteria, we just pick one of the remaining at random and remove them. All this limited to a maximum as you are already doing with MAX_FORCED_INBOUND_CONNECTIONS.

For this to trigger, your number of noban connections needs to be too high, and/or, your max number of connections is too low. If we are concerned that this could backfire, as @vasild mentioned, we could even gatekeep it under a global flag so it does not affect any of the existing noban connections.
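
A rough sketch of the trigger condition proposed above, assuming the caller has already established that the new inbound is whitelisted. The function and parameter names are hypothetical; the comparison follows the wording of the comment ("smaller than") literally.

```cpp
#include <cstddef>

// remaining_candidates: peers left after removing noban and outbound peers.
// diversity_protected_slots: how many peers the diversity criteria would protect.
// forced_inbound_now / max_forced_inbound: current count and cap for forced inbounds.
constexpr bool ShouldEvictRandomPeerSketch(std::size_t remaining_candidates,
                                           std::size_t diversity_protected_slots,
                                           std::size_t forced_inbound_now,
                                           std::size_t max_forced_inbound)
{
    return remaining_candidates > 0 &&
           remaining_candidates < diversity_protected_slots &&
           forced_inbound_now < max_forced_inbound;
}
```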

DrahtBot · 2024-04-02T00:02:09Z

⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?

Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
Is it no longer relevant? ➡️ Please close.
Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

pinheadmz mentioned this pull request May 8, 2023

whiteconnections should be re-added #8798

Open

pinheadmz changed the title ~~Make peer eviction slightly more aggresive to make room for whitelisted inbound connections~~ Allow inbound whitebind connections to more aggresivey evict peers when slots are full May 8, 2023

This was referenced May 9, 2023

refactor: Introduce EvictionManager and use it for the inbound eviction logic #25572

Closed

refactor: Introduce EvictionManager #25268

Closed

DrahtBot changed the title ~~Allow inbound whitebind connections to more aggresivey evict peers when slots are full~~ net: Allow inbound whitebind connections to more aggresivey evict peers when slots are full May 9, 2023

DrahtBot added the P2P label May 9, 2023

stickies-v reviewed May 11, 2023

pinheadmz changed the title ~~net: Allow inbound whitebind connections to more aggresivey evict peers when slots are full~~ net: Allow inbound whitebind connections to more aggressively evict peers when slots are full May 12, 2023

stickies-v reviewed May 16, 2023

stickies-v reviewed May 16, 2023

mzumsande reviewed May 17, 2023

pinheadmz force-pushed the whitebind-evict branch from e71d495 to c826187 Compare May 17, 2023 20:18

stickies-v reviewed May 17, 2023

pinheadmz force-pushed the whitebind-evict branch from c826187 to 2ab1ed6 Compare May 18, 2023 14:18

pinheadmz force-pushed the whitebind-evict branch from 2ab1ed6 to 96b513f Compare May 18, 2023 14:24

DrahtBot added the CI failed label May 18, 2023

stickies-v reviewed May 22, 2023

pinheadmz force-pushed the whitebind-evict branch from 96b513f to fa78fc5 Compare May 24, 2023 17:17

pinheadmz requested a review from stickies-v May 24, 2023 19:14

stickies-v reviewed May 24, 2023

DrahtBot removed the CI failed label May 25, 2023

LarryRuane reviewed Jun 10, 2023

DrahtBot requested a review from stickies-v June 10, 2023 18:20

DrahtBot requested review from LarryRuane and removed request for LarryRuane September 26, 2023 14:19

pinheadmz force-pushed the whitebind-evict branch from 8585fe3 to 8639a45 Compare September 26, 2023 17:35

pinheadmz force-pushed the whitebind-evict branch from 8639a45 to 32f48ef Compare September 26, 2023 20:09

DrahtBot mentioned this pull request Sep 30, 2023

BIP324 integration #28331

Merged

DrahtBot mentioned this pull request Oct 19, 2023

refactor: Split per-peer parts of net module into new node/connection module #28686

Closed

mzumsande reviewed Nov 3, 2023

naumenkogs reviewed Nov 9, 2023

pinheadmz added 6 commits November 9, 2023 11:33

eviction: track one random unprotected node to evict if forced

0c0f2a2

Accomplished by adding a bool argument `force` to SelectNodeToEvict()

net: add new permission ForceInbound

99399b3

Only inbound nodes with this permission set will call `SelectNodeToEvict()` with force=true, so when connections are full there is an increased likelihood of opening a slot for the new inbound. Extends NoBan permission.

net: nodes with ForceInbound permission force eviction

8bc2030

test: cover ForceInbound permission success even when connections are…

6b6bcaf

… full

doc: add release note for bitcoin#27600

e4b860a

net: only allow 8 simultaneous forced inbound connections

7586802

net: add forced_inbound to getpeerinfo

8c20268

DrahtBot mentioned this pull request Nov 9, 2023

Multiprocess bitcoin #10102

Draft

naumenkogs reviewed Nov 10, 2023
