Correct `msgs_sent_since_pong` tracking for gossip forwards #4093
Conversation
When we forward gossip messages, we use the already-encoded copy we have, pushing it onto the `gossip_broadcast_buffer`. When we have free socket buffer space and there are no non-gossip messages pending, we remove it from the `gossip_broadcast_buffer` and push it onto the normal `pending_outbound_buffer` socket queue.

Separately, we use `msgs_sent_since_pong` to ensure that our peer is being responsive and our messages are getting through with minimal latency. After we send 32 messages (if we've already received the `pong` for the last `ping` we sent), we send an extra `ping` and will only queue up an additional 32 forwarded gossip messages before we start dropping forwarded gossip on the floor.

Previously, we (arguably) incorrectly incremented `msgs_sent_since_pong` when we pushed a message onto the `gossip_broadcast_buffer`, not when it was actually queued up in our `pending_outbound_buffer` immediate socket queue. This is fairly strange: the point of `msgs_sent_since_pong` is to keep peer latency low, and we already independently enforce memory limits to ensure we drop forwarded gossip messages if a peer's message queues have grown too large. This means we could disconnect a peer for not draining its backlog of forwarded gossip from the socket queue fast enough, rather than simply dropping gossip.

While this shouldn't be a big deal in practice, I do see (mostly Tor) peers disconnect on my node occasionally, and while it's likely due to Tor circuits hanging and a disconnect is needed, the incorrect tracking here makes state analysis difficult, as there is nearly always a flow of enough gossip messages to "send to the peer". It's also possible that this change allows incredibly low-bandwidth connections to stay afloat more durably.
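To make the fix concrete, here is a minimal sketch (not the actual LDK code) of the two-queue flow described above. The struct and field names mirror the PR's description; the `Peer` type, method names, and the `MSGS_PER_PING` constant are illustrative assumptions. The key point is that `msgs_sent_since_pong` is incremented when a message moves onto the `pending_outbound_buffer` socket queue, not when it first lands in the `gossip_broadcast_buffer`:

```rust
use std::collections::VecDeque;

// Illustrative threshold: after this many queued-to-socket messages we'd
// send an extra ping, and at double it we drop forwarded gossip.
const MSGS_PER_PING: usize = 32;

struct Peer {
    // Pre-encoded gossip awaiting free socket buffer space.
    gossip_broadcast_buffer: VecDeque<Vec<u8>>,
    // The immediate socket queue.
    pending_outbound_buffer: VecDeque<Vec<u8>>,
    msgs_sent_since_pong: usize,
}

impl Peer {
    fn new() -> Self {
        Peer {
            gossip_broadcast_buffer: VecDeque::new(),
            pending_outbound_buffer: VecDeque::new(),
            msgs_sent_since_pong: 0,
        }
    }

    /// Queue a pre-encoded gossip message for forwarding. Note that the
    /// counter is NOT incremented here -- doing so was the bug being fixed.
    /// If the peer is too far behind on pongs, drop the gossip instead.
    fn forward_broadcast_msg(&mut self, encoded: Vec<u8>) {
        if self.msgs_sent_since_pong >= 2 * MSGS_PER_PING {
            return; // dropped on the floor rather than queued
        }
        self.gossip_broadcast_buffer.push_back(encoded);
    }

    /// When the socket has room and no non-gossip messages are pending,
    /// move one gossip message onto the real socket queue. The counter is
    /// incremented here, when the message is actually queued to be sent.
    fn maybe_flush_one_gossip_msg(&mut self) {
        if let Some(msg) = self.gossip_broadcast_buffer.pop_front() {
            self.pending_outbound_buffer.push_back(msg);
            self.msgs_sent_since_pong += 1;
        }
    }
}

fn main() {
    let mut peer = Peer::new();
    for _ in 0..5 {
        peer.forward_broadcast_msg(vec![0u8; 16]);
    }
    // Nothing has reached the socket queue yet, so the counter stays at 0.
    assert_eq!(peer.msgs_sent_since_pong, 0);
    peer.maybe_flush_one_gossip_msg();
    assert_eq!(peer.msgs_sent_since_pong, 1);
    assert_eq!(peer.pending_outbound_buffer.len(), 1);
    println!("counter after one flush: {}", peer.msgs_sent_since_pong);
}
```

With the old placement, the five `forward_broadcast_msg` calls alone would have pushed the counter toward the ping threshold even though nothing had been handed to the socket yet, which is exactly the state-analysis confusion the PR describes.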
👋 I see @tankyleo was un-assigned.
Codecov Report

Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #4093      +/-   ##
==========================================
- Coverage   88.61%   88.60%   -0.02%
==========================================
  Files         176      176
  Lines      132099   132120      +21
==========================================
+ Hits       117060   117065       +5
- Misses      12365    12386      +21
+ Partials     2674     2669       -5
==========================================
tnull
left a comment
Makes sense, simple enough, landing.
Do you happen to recall why we originally incremented it when adding to the `gossip_broadcast_buffer`?
I do not. I assume it was just convenient: the message is now in a "message queue", so it looks kinda like we should increment the "messages queued" counter.