Recover plugin to automate recovery in CLN #6853
Conversation
I like the general approach, but we also need to have an idea when gossipd thinks it is synced (probably in getinfo). If we are on regtest, I would simply consider us always up-to-date, but for mainnet you could use some heuristic like "more than 1000 channels and the seeker is in state NORMAL" maybe?
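A minimal sketch of the suggested heuristic, assuming hypothetical inputs (the network name, a channel count, and the seeker's state as plain values; these parameter names are illustrative, not a confirmed CLN API):

```python
def gossip_synced(network: str, channel_count: int, seeker_state: str) -> bool:
    """Sketch of the proposed gossip-sync heuristic: on regtest, always
    consider ourselves up to date; on mainnet, require a reasonably full
    network view and the seeker to have settled into its NORMAL state."""
    if network == "regtest":
        return True
    return channel_count > 1000 and seeker_state == "NORMAL"
```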
force-pushed from 39504f8 to e55230a
force-pushed from e55230a to 76c0394
force-pushed from 76c0394 to 2fb8e6b
Looking good so far! Minor nitpicks...
force-pushed from 7e3a4b4 to b42e775
force-pushed from da38b09 to 6cbf8f5
I like it, but test fails for some reason?
tests/test_misc.py (outdated):

```python
# l2.daemon.wait_for_log(r'All outputs resolved.*')
wait_for(lambda: l2.rpc.listfunds()["channels"][0]["state"] == "ONCHAIN")
wait_for(lambda: l2.rpc.listfunds()["channels"][1]["state"] == "ONCHAIN")
```
CI failing here: seems like l2 only has one channel?
It passes locally, but fails here for some reason. I am trying to resolve it...
force-pushed from 52b73bf to ad35815
force-pushed from ad35815 to e910f4c
Removed milestone as this has a crash bug, and we need to move on with the release process. Happy to re-add it if fixed in the next RC.
force-pushed from da547bc to df36ca9
The CI error is fixed here: #7080
force-pushed from df36ca9 to 7ba8a2d
Rebased on top of …
…'d help us identify if we've fallen behind or lost some state.
… and try to recover the node by entering recover mode.
…ected nodes on the network and call emergency_recover immediately.
…orage and then call restorefrompeer repeatedly.
force-pushed from 7ba8a2d to 89eadcd
… being recovered when we lose some state and enter recover mode.
force-pushed from 89eadcd to b296e3f
@adi2011 notice that if you force-push we have to run all the tests again, whereas if we just restart we can just rerun the failing cases. Let me take care of CI and this will be merged soon.
Finally!! 🎉🎉❤️
This would help users detect that they've lost some data and would automatically try to recover from SCB or peer storage.
Read #6544 for more!
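The recovery flow described in the commit messages above (detect lost state, then ask peers for our backup) could be sketched as a simple retry loop. Here `restorefrompeer` stands in for the CLN RPC call of that name; the `"stubs"` result field and the retry parameters are assumptions for illustration, not the plugin's actual implementation:

```python
import time

def attempt_recovery(restorefrompeer, max_attempts=10, delay=1.0):
    """Call restorefrompeer repeatedly until some peer hands back our
    stored channel backup, then return the result; give up (returning
    None) after max_attempts tries. The 'stubs' field is an assumed
    shape for the RPC result."""
    for _ in range(max_attempts):
        result = restorefrompeer()
        if result.get("stubs"):  # non-empty list => channels recovered
            return result
        time.sleep(delay)
    return None
```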