-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase fd limits (and warn as we approach them) #7237
Increase fd limits (and warn as we approach them) #7237
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK adbf9bc
4640693
to
f4c76a4
Compare
f4c76a4
to
f858d34
Compare
Added the connecting map to the memleak detection callback in connectd. Also included a driveby fix for tracepoints memory initialization. |
Fixing up test assumption that hard limit > soft limit (on CI it seems to be 65536 for both!). |
f858d34
to
e663658
Compare
This works fine for me locally, but the CI seems unable to raise the fd limit after pytest constrains it. I noticed if I run pytest as root, it exhibits the same behavior. |
would be nice to test this in docker, kubernetes, and the like to make sure it doesn't have the same constraints |
Agreeed. Meanwhile, this doesn't make it worse if we can't increase fd limits, and at least now we'll get a warning if it seems limiting. |
So we can log when we hit fd limits on accept/recvmsg. Signed-off-by: Rusty Russell <[email protected]>
This can happen if we're totally out of fds, but previously we gave no log message indicating this! Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
I thought I was going to want to have a convenient way of counting these, but it turns out unnecessary. Still, this is slightly more efficient and simple, so I am including it. Signed-off-by: Rusty Russell <[email protected]>
We use a crude heuristic: if we were trying to contact them, it's a "deliberate" connection, and should be preserved. Changelog-Changed: connectd: prioritize peers with channels (and log!) if we run low on file descriptors. Signed-off-by: Rusty Russell <[email protected]>
… channels. 1024 is a common limit, and people are starting to hit that many channels, so we should increase it: twice the number of channels seems reasonable, though we only do this at restart time. Changelog-Changed: lightningd: we now try to increase the number of file descriptors, if it's less than twice the number of channels at startup (and log if we cannot!). Signed-off-by: Rusty Russell <[email protected]>
They're cheap. The 2x channels heuristic is nice, but does assume they restart every so often. If someone hits 64k connections I would like to know anyway! Signed-off-by: Rusty Russell <[email protected]>
e663658
to
995fefd
Compare
We expect: UNUSUAL.*WARNING: we have 1 channels but can file descriptors limited to 65536 We get: lightningd: WARNING: we have 1 channels but can file descriptors limited to 32768! This is strange, since the first line is from Python's hard limit. Presumably something is restricting the fd limit of children Signed-off-by: Rusty Russell <[email protected]>
995fefd
to
71c039e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 71c039e
Passing the github CI should also give us some clue about the limits in VMs?
On many systems, the default fd limit is 1024, and we've got users with over 900 channels. But that's a soft limit, and you can simply ask for an increase!
Also, we should log when we are running low, so people don't have to ask me what to do!