From 01cd605cb1d526461779b7d5e3d1119b74e8f48e Mon Sep 17 00:00:00 2001
From: Rusty Russell
Date: Tue, 9 Jul 2024 11:22:40 +0930
Subject: [PATCH] connectd: fix missing peer close.

We were getting the following message in test_feerate_stress:

```
2024-07-08T02:15:45.5663941Z lightningd-2 2024-07-08T02:13:45.696Z **BROKEN** 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Peer did not close, forcing close
```

I can reproduce it locally if I run the test enough, and finally found
the issue by printing the status of the fd when we time it out (using
routines from connectd.c).

The peer fd alternates between reading and writing.  When we go to
discard it, we wake the write queue, so write_to_peer() gets called.
It won't shut down the socket if there are still subds attached, and
will wait again for a read.

The last subd exit has to also wake the write queue if we're draining,
so it can do the io_sock_shutdown.  Otherwise, we hit the timeout,
causing the message above.

Signed-off-by: Rusty Russell
---
 connectd/multiplex.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/connectd/multiplex.c b/connectd/multiplex.c
index 81d5a52b0912..aadde8ba7330 100644
--- a/connectd/multiplex.c
+++ b/connectd/multiplex.c
@@ -1032,6 +1032,11 @@ static void destroy_subd(struct subd *subd)
 	 * have been waiting for write_to_subd) */
 	io_wake(&peer->peer_in);
 
+	/* If this is the last subd, and we're draining, wake outgoing
+	 * now (it will start shutdown). */
+	if (tal_count(peer->subds) == 0 && peer->to_peer && peer->draining)
+		msg_wake(peer->peer_outq);
+
 	/* Maybe we were last subd out? */
 	maybe_free_peer(peer);
 }
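
For reviewers who want to see the missed wakeup in isolation, here is a minimal toy model of the sequence described in the commit message. It is only a sketch: `struct toy_peer` and every `toy_*` function are hypothetical stand-ins invented for illustration, not connectd's real types or the ccan/io API, and the model leaves out the `peer->to_peer` check the actual patch makes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for connectd's peer state; NOT the real struct peer. */
struct toy_peer {
	size_t num_subds;	/* subdaemons still attached */
	bool draining;		/* we are discarding this peer */
	bool outq_woken;	/* a wakeup is pending for the writer */
	bool shut_down;		/* socket shutdown has been started */
};

/* Models the writer: it only runs when woken, and only starts the
 * shutdown if we are draining and no subds remain. */
static void toy_write_to_peer(struct toy_peer *peer)
{
	if (!peer->outq_woken)
		return;
	peer->outq_woken = false;
	if (peer->draining && peer->num_subds == 0)
		peer->shut_down = true;
	/* Otherwise it goes back to waiting for the next wakeup. */
}

/* Models a subd exiting.  With the fix, the last subd to leave a
 * draining peer wakes the writer so it can start the shutdown. */
static void toy_destroy_subd(struct toy_peer *peer, bool with_fix)
{
	peer->num_subds--;
	if (with_fix && peer->num_subds == 0 && peer->draining)
		peer->outq_woken = true;
}

/* Returns true if the shutdown starts without relying on the timeout. */
static bool toy_drain_closes(size_t num_subds, bool with_fix)
{
	struct toy_peer peer = { num_subds, false, false, false };

	/* Start discarding: mark as draining and wake the writer once. */
	peer.draining = true;
	peer.outq_woken = true;
	toy_write_to_peer(&peer);

	/* Subds exit one by one; the writer runs only if it was woken. */
	while (peer.num_subds > 0) {
		toy_destroy_subd(&peer, with_fix);
		toy_write_to_peer(&peer);
	}
	return peer.shut_down;
}
```

In this model, `toy_drain_closes(1, false)` returns false: the single wakeup is consumed while a subd is still attached, and nothing wakes the writer again. That corresponds to hitting the timeout. `toy_drain_closes(1, true)` returns true because the last subd's exit supplies the missing wakeup.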