Skip to content

htlc_wire: fix crash when adding an HTLC#8502

Merged
rustyrussell merged 2 commits intoElementsProject:masterfrom
Lagrang3:hot-fix-patch
Aug 27, 2025
Merged

htlc_wire: fix crash when adding an HTLC#8502
rustyrussell merged 2 commits intoElementsProject:masterfrom
Lagrang3:hot-fix-patch

Conversation

@Lagrang3
Copy link
Collaborator

In line channeld/channeld_wiregen.c:832 *added+i is not a tal object hence the instruction in common/htlc_wire.c:200 tal_arr(ctx, struct tlv_field, 0); crashes CLN. This is fixed by stating that added_htlc is a a varsize_type.

Logs:

2025-08-16T02:25:28.640Z **BROKEN** lightningd: FATAL SIGNAL 6 (version v25.05-200-g79b959b)V ...
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/tal/tal.c:95 (call_error) 0x54f6bc 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/tal/tal.c:169 (check_bounds) 0x54f75a
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/tal/tal.c:178 (to_tal_hdr) 0x54f782 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/tal/tal.c:193 (to_tal_hdr_or_null) 0x54f7c7 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/tal/tal.c:471 (tal_alloc_) 0x54ffe4 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/tal/tal.c:517 (tal_alloc_arr_) 0x5500c4 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: common/htlc_wire.c:200 (fromwire_len_and_tlvstream) 0x48d63d 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: common/htlc_wire.c:234 (fromwire_added_htlc) 0x48dd23 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: channeld/channeld_wiregen.c:832 (fromwire_channeld_got_commitsig) 0x4c61fa 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: lightningd/peer_htlcs.c:2377 (peer_got_commitsig) 0x4549cb 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: lightningd/channel_control.c:1552 (channel_msg) 0x4140fe 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: lightningd/subd.c:560 (sd_msg_read) 0x461513 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/io/io.c:60 (next_plan) 0x544885 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/io/io.c:422 (do_plan) 0x544cea 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/io/io.c:439 (io_ready) 0x544d9d 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/io/poll.c:455 (io_loop) 0x54665d 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: lightningd/io_loop_with_timers.c:22 (io_loop_with_timers) 0x42d220 
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: lightningd/lightningd.c:1487 (main) 0x43280f

gdb inspection:

830             *added = num_added ? tal_arr(ctx, struct added_htlc, num_added) : NULL;
831             for (size_t i = 0; i < num_added; i++)
832                     fromwire_added_htlc(&cursor, &plen, *added + i);
(gdb) p i
$3 = 1

Fixes issue #8461.

Comment on lines +1594 to +1595
a->extra_tlvs = tal_dup_talarr(a, struct tlv_field,
htlc->extra_tlvs);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewers: please check this line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is unnecessary. We assumed before (and can still assume) that the lifetime of the htlc is greater than the lifetime of this added marshalling, so simply share a pointer.

If we needed this, we would need to copy the contents of the tlv too: the value field in particular. We would normally do this by implementing tlv_field_arr_dup().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we needed this, we would need to copy the contents of the tlv too: the value field in particular. We would normally do this by implementing tlv_field_arr_dup().

Thanks. I thought this might be the case.

I'll share the pointer.

@nepet
Copy link
Member

nepet commented Aug 26, 2025

I had the exact same PR ready as well 😆 so I approve these changes.

It would be really nice if we had a way to reproduce #8461 in a test to check if this really solves the issue and to avoid a regression in the future.

@Lagrang3
Copy link
Collaborator Author

Lagrang3 commented Aug 26, 2025

Well. It happened on my node. And I wasn't making any payments.
It might have been during routing through my node.
I'll check the logs to see if I can find something useful to reproduce this.

From the code it seems to me that we need to receive more than one added htlc from the wire AND have extra tlvs on those.
Let's do that for homework.
But probably @rustyrussell knows.

@grubles
Copy link
Contributor

grubles commented Aug 26, 2025

Well. It happened on my node. And I wasn't making any payments. It might have been during routing through my node. I'll check the logs to see if I can find something useful to reproduce this.

From the code it seems to me that we need to receive more than one added htlc from the wire AND have extra tlvs on those. Let's do that for homework. But probably @rustyrussell knows.

Yeah in my case it happened during routing, not when trying to send a payment.

@nepet
Copy link
Member

nepet commented Aug 26, 2025

From the code it seems to me that we need to receive more than one added htlc from the wire AND have extra tlvs on those.

If this is the case we can use sendpay and the test plugin htlc_accepted-customtlv.py to send and intercept some htlcs and add an extra_tlv to them:

          ( intercepts htlc )
          ( adds extra tv. )
node_1 -------> node_2 -------> node_3 ---> node_4

@rustyrussell rustyrussell added this to the v25.09 milestone Aug 27, 2025
@rustyrussell
Copy link
Contributor

OK. We need to add two htlcs, the second of which contains tlvs.

I've added a test, and neatened @Lagrang3's commit slightly:

  1. We don't need to copy extra_tlvs, we can assume lifetime of htlc is greater.
  2. There was a gratuitous formatting change which I removed

@rustyrussell rustyrussell enabled auto-merge (rebase) August 27, 2025 01:29
rustyrussell and others added 2 commits August 27, 2025 12:57
Reported-by: grubles
Signed-off-by: Rusty Russell <[email protected]>
In line channeld/channeld_wiregen.c:832 `*added+i` is not a tal object hence
the instruction in common/htlc_wire.c:200 `tal_arr(ctx, struct tlv_field, 0);` crashes CLN.
This is fixed by stating that added_htlc is a a varsize_type.

Logs:

2025-08-16T02:25:28.640Z **BROKEN** lightningd: FATAL SIGNAL 6 (version v25.05-200-g79b959b)V
...
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/tal/tal.c:95 (call_error) 0x54f6bc
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/tal/tal.c:169 (check_bounds) 0x54f75a
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/tal/tal.c:178 (to_tal_hdr) 0x54f782
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/tal/tal.c:193 (to_tal_hdr_or_null) 0x54f7c7
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/tal/tal.c:471 (tal_alloc_) 0x54ffe4
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/tal/tal.c:517 (tal_alloc_arr_) 0x5500c4
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: common/htlc_wire.c:200 (fromwire_len_and_tlvstream) 0x48d63d
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: common/htlc_wire.c:234 (fromwire_added_htlc) 0x48dd23
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: channeld/channeld_wiregen.c:832 (fromwire_channeld_got_commitsig) 0x4c61fa
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: lightningd/peer_htlcs.c:2377 (peer_got_commitsig) 0x4549cb
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: lightningd/channel_control.c:1552 (channel_msg) 0x4140fe
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: lightningd/subd.c:560 (sd_msg_read) 0x461513
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/io/io.c:60 (next_plan) 0x544885
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/io/io.c:422 (do_plan) 0x544cea
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/io/io.c:439 (io_ready) 0x544d9d
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: ccan/ccan/io/poll.c:455 (io_loop) 0x54665d
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: lightningd/io_loop_with_timers.c:22 (io_loop_with_timers) 0x42d220
2025-08-16T02:25:28.640Z **BROKEN** lightningd: backtrace: lightningd/lightningd.c:1487 (main) 0x43280f

gdb inspection:
830             *added = num_added ? tal_arr(ctx, struct added_htlc, num_added) : NULL;
831             for (size_t i = 0; i < num_added; i++)
832                     fromwire_added_htlc(&cursor, &plen, *added + i);
(gdb) p i
$3 = 1

Changelog-None: crash introduced this release.
Signed-off-by: Lagrang3 <[email protected]>
[ Added test, removed Changelog --RR ]
@rustyrussell rustyrussell merged commit 7e5cf41 into ElementsProject:master Aug 27, 2025
39 checks passed
@Lagrang3 Lagrang3 deleted the hot-fix-patch branch August 27, 2025 06:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants