-
Notifications
You must be signed in to change notification settings - Fork 913
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor channel state in lightningd #6628
Refactor channel state in lightningd #6628
Conversation
34044ef
to
f9ece98
Compare
ccb877d
to
659ed0f
Compare
Previously, an unexpected behavior in the ln helper function had gone unnoticed. However, while Rusty Russell was working on [1], he discovered a bug in lnprototest. Specifically, within the helper function that opens a channel, there was an incorrect way of counting the current block height. Ideally, this logic should be abstracted elsewhere. This commit addresses the block height counting error and resolves the bug identified in [1]. [1] ElementsProject/lightning#6628 Link: https://discord.com/channels/899980449231814676/941465665540325397/1154259454716563537 Reported-by: Rusty Russell <[email protected]> Signed-off-by: Vincenzo Palazzo <[email protected]>
Previously, an unexpected behavior in the ln helper function had gone unnoticed. However, while Rusty Russell was working on [1], he discovered a bug in lnprototest. Specifically, within the helper function that opens a channel, there was an incorrect way of counting the current block height. Ideally, this logic should be abstracted elsewhere. This commit addresses the block height counting error and resolves the bug identified in [1]. [1] ElementsProject/lightning#6628 Link: https://discord.com/channels/899980449231814676/941465665540325397/1154259454716563537 Reported-by: Rusty Russell <[email protected]> Signed-off-by: Vincenzo Palazzo <[email protected]>
Previously, an unexpected behavior in the ln helper function had gone unnoticed. However, while Rusty Russell was working on [1], he discovered a bug in lnprototest. Specifically, within the helper function that opens a channel, there was an incorrect way of counting the current block height. Ideally, this logic should be abstracted elsewhere. This commit addresses the block height counting error and resolves the bug identified in [1]. [1] ElementsProject/lightning#6628 Link: https://discord.com/channels/899980449231814676/941465665540325397/1154259454716563537 Reported-by: Rusty Russell <[email protected]> Signed-off-by: Vincenzo Palazzo <[email protected]>
Previously, an unexpected behavior in the ln helper function had gone unnoticed. However, while Rusty Russell was working on [1], he discovered a bug in lnprototest. Specifically, within the helper function that opens a channel, there was an incorrect way of counting the current block height. Ideally, this logic should be abstracted elsewhere. This commit addresses the block height counting error and resolves the bug identified in [1]. [1] ElementsProject/lightning#6628 Link: https://discord.com/channels/899980449231814676/941465665540325397/1154259454716563537 Reported-by: Rusty Russell <[email protected]> Signed-off-by: Vincenzo Palazzo <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments while I was review the PR to understand what was changed from before
In order to fix lnprototest you should build on top of #6702
The problem was that lnprototest was counting the current block heigh in the wrong way and this mean that core lightning was going out of sync (causing the reorg)
Oh sweet! A clean up of funding depth would be awesome! |
Reviewed the parts related to splicing, ACK 659ed0f While looking at this I've realized in splice confirmation, channel_control is setting channel->scid too early. A peer disconnection and restart of channeld here will cause channeld to boot up with the new scid before That's an existing problem not caused by this refactor -- though fixing it will cause merge conflicts 😅 Edit: I think this was easier to notice because of the cleaner code post refactor -- woo clean code! |
661ffbd
to
e4032d9
Compare
OK, this needed some rework: there was more to clean up! In particular, we used funding_depth_cb for both the main funding tx, as well as inflights (DF RBF attempts, splicing). This was deeply confusing, and caused weird issues until I understood it. I've now separated those out. |
db77fc9
to
c8d0e93
Compare
Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
fd1d019
to
e49a2dc
Compare
This is actually a real issue (l1 doesn't see the warning before l2 drops the connection), but it's unrelated to this PR, and will require another one to fix. Signed-off-by: Rusty Russell <[email protected]>
… state. Signed-off-by: Rusty Russell <[email protected]>
Check the exact scids. Makes it simpler when failures occur. Signed-off-by: Rusty Russell <[email protected]>
Not just if htlc addition is too slow, make this the default. dual-open's txabort is excluded, however. Signed-off-by: Rusty Russell <[email protected]>
It's a mess right now. Try to express it as a switch() statement over the states we can be in. Signed-off-by: Rusty Russell <[email protected]>
This is a workaround, the real fix is to use a different callback for inflight splice attempts, which comes later. Signed-off-by: Rusty Russell <[email protected]>
Rename slightly, remove first arg, and make it a noop of there's no owner on channel. Signed-off-by: Rusty Russell <[email protected]>
Currently it's half done in funding_depth_cb, and half in channeld_tell_depth. It's very confusing as a result, with splicing, dual-funding and zeroconf. This does introduce a behaviour change: if a channel is NORMAL and it gets reorganized, we force close (unless we were the one who funded it, or it's zeroconf anyway). This is safer than continuing to use the channel in this case! Some tests are changed to zeroconf to make them work, but v2 doesn't support zeroconf, so that's removed. Signed-off-by: Rusty Russell <[email protected]>
Take an optional filter function, so callers can say exactly what they want. Signed-off-by: Rusty Russell <[email protected]>
We should use capability tests for states (can you add htlcs?) rather than vague descriptions (are you closing?). And as much as possible, use switch () statements to force us to think about all the cases, especially when we add new states! Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
…nnel_state_uncommitted. Signed-off-by: Rusty Russell <[email protected]>
We usually hand times by copy, not by pointer (and if we did, they should be const!). I noticed this particularly for the state changed code, but it goes down to to json_add_timeiso, so I fixed that too. Signed-off-by: Rusty Russell <[email protected]>
This is the variant of DUALOPEND_OPEN_INIT which you see once the channel is in the db: we'll be adding it next, but to reduce clutter the docs are added as a separate commit. Signed-off-by: Rusty Russell <[email protected]>
…ALOPEND_OPEN_COMMITTED. The latter is used when we're put in the db, the former is the uncommitted state. Currently dbid == 0 is used in addition to the state, which is unwieldy. Signed-off-by: Rusty Russell <[email protected]> Changelog-Experimental: JSON-RPC: added new dual-funding state `DUALOPEND_OPEN_COMMITTED`
It has the information we need, now. Signed-off-by: Rusty Russell <[email protected]>
It was a wrapper only used in one place anyway. Signed-off-by: Rusty Russell <[email protected]>
Don't assume the arg is a channel. Signed-off-by: Rusty Russell <[email protected]>
We never do this, but we're about to (we always watch before we broadcast a tx). We use a `depth` member to avoid calling the callback multiple times for the same event, but we initialize it to 0. This means if we register a watch, and the first thing that happens is that it reorganizes out, we *don't* make the callback. Use an impossible value at initialization, instead. Signed-off-by: Rusty Russell <[email protected]>
We use the *same* callback for the funding tx, as well as for inflight dual-funding txs, as well as inflight splice txs. This is deeply confusing! Instead, use explicit cbs for splicing and df. Once they're locked in, use the normal callback. Signed-off-by: Rusty Russell <[email protected]>
…ce scid. We used the original channel funding output number. I'm not sure if this was true in the previous code, or a regression I introduced, but it caused occasonal failures in test_splice_gossip! Signed-off-by: Rusty Russell <[email protected]>
Now we're not always using the same functions to watch during dual-funding opening, we need to make sure we're watching the close (in particular, df close before the opening is confirmed). So, keep a pointer, and if it's not set in drop_to_chain, set it. Signed-off-by: Rusty Russell <[email protected]>
…irms. Signed-off-by: Rusty Russell <[email protected]>
…g tx. We make dualopend_tell_depth static, which means we move it higher in the file. Signed-off-by: Rusty Russell <[email protected]>
e49a2dc
to
d333f19
Compare
It started with our handling of the funding depth callback, which needed cleaning up badly. But it lead me quickly to our channel state functions, which are unclear, especially as we added dual funding states and a splicing state. This makes the code cleaner and easier to update in future.
I stopped short of converting "uncommitted_channel" to a channel in a new state, but that is the next obvious step (this entire series was a side-quest as I was trying to clean up our channel_update handling!).