bpftune doesn't seem to notice it's hit an external limit when attempting to increase TCP buffers #93
Thanks for the detailed report, it's really helpful! On the core issue, bpftune definitely should notice, but the key question is what to do in the general case. In your particular case, 2GB seems like a good max limit, but in cases where rmem_max is too low, should we bump it too? I'd like to have better mechanisms to put the brakes on runaway increases, so any ideas are most welcome! Currently we look for correlations between buffer size increases and RTT increases as a signal that we're buffering too much, but something more sophisticated would be good here. I'll give it some thought, do some experiments at my end, and post when I have something. Again, thanks for filing this!
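(As a rough illustration of the "put the brakes on" idea described above, here is a minimal, hypothetical C sketch of an RTT-based check. The names and the 10% threshold are assumptions for illustration only, not bpftune's actual code.)

```c
/*
 * Hypothetical sketch: compare smoothed RTT before and after a buffer
 * increase, and hold the current size if latency climbed noticeably.
 * All names and thresholds are illustrative, not bpftune's real logic.
 */
#include <stdbool.h>
#include <stdint.h>

struct buf_tune_state {
	uint64_t rtt_before_us;  /* smoothed RTT sampled before the last increase */
	uint64_t rtt_after_us;   /* smoothed RTT sampled after the last increase */
	long     cur_size;       /* current buffer maximum */
};

/* Return true if the next 25% increase should be allowed. */
static bool allow_next_increase(const struct buf_tune_state *s)
{
	/* If RTT grew by more than ~10% after the last bump, assume we are
	 * buffering too much and stop growing for now. */
	if (s->rtt_after_us > s->rtt_before_us + s->rtt_before_us / 10)
		return false;
	return true;
}
```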
Well, maybe... we all know that everything has a downside. I'm not sure what other implications or repercussions there are to setting the default allocation per connection higher. Do we hit some maximum amount of address space we can allocate or reference in a single operation, causing what could previously be done in one cycle to now take two or more? (I don't know if that's a real thing, just theorizing about the ways that bigger might not be better :) )

We're talking about at least two problems here, so I'll separate them:
As that link references, 2GB is the largest value rmem_max may be set to. Unless something changes in the kernel, this is a boundary we can't go past. Does bpftune have a table of similar core truths and maximum value caps that it consults, a "we can't go past 10... err... 11" :) ? In this case we should certainly go ahead and adjust rmem_max and wmem_max to their upper bounds, and then continue adjusting tcp_rmem upwards.
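(A minimal sketch of that "table of caps" idea, in C, assuming a clamp against the kernel's int ceiling before each proposed increase. The function and macro names are made up for illustration.)

```c
/*
 * Hypothetical sketch: clamp a proposed value to a known kernel ceiling
 * before trying to apply it, instead of blindly growing by 25% forever.
 * Not bpftune's actual implementation.
 */
#include <limits.h>

/* net.core.rmem_max / wmem_max are ints in the kernel, so INT_MAX
 * (2^31 - 1) is the hard ceiling. */
#define SYSCTL_INT_MAX ((long)INT_MAX)

static long propose_buffer_increase(long current)
{
	long proposed = current + current / 4;  /* the usual +25% step */

	if (proposed > SYSCTL_INT_MAX || proposed < current /* wrapped */)
		proposed = SYSCTL_INT_MAX;

	return proposed;
}
```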
I would think that "I tried to make this same adjustment but the operation didn't take for some reason" ought to be noticed. Do we decide "this knob is not having the effect we hoped" and look for other options? Do we abstract further, with a local "observation ratchet" mechanism and a wider-lens "desired end result solver" to suggest the next plan of action when we get into head-banging loops like this? i.e., the "I NEED AN ADULT!" call for help suggesting alternate adjustment strategies once we're stuck.

Either way, if the same adjustment failed twice in a row, the third failure ought to trigger some sort of back-off, or a signal that "we're doing it wrong™️", since the current "bang wall harder with head" method is actually hurting performance, if nothing else than from the sheer volume of log messages being generated. Maybe we set an acceptable "adjustments per ~10s period" number, which still allows quick ratcheting behavior while providing a safety net against suboptimal adjustment loops like this one.

How do we assess which buffers, queues, modules, parameters, and external influences are related to the state we're in, so as to evaluate what else to change? Fun things to ponder, certainly.
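(Here is a small, hypothetical C sketch of the back-off/safety-net idea above: if the same knob fails to move a few times, stop retrying it for a while. The constants and names are assumptions for illustration, not anything bpftune actually does.)

```c
/*
 * Hypothetical sketch of a per-tunable back-off: after FAIL_LIMIT
 * consecutive ineffective adjustments, stop retrying this knob for a
 * while instead of hammering it (and the logs). Illustrative only.
 */
#include <stdbool.h>
#include <time.h>

#define FAIL_LIMIT   3   /* third consecutive failure trips the back-off */
#define BACKOFF_SECS 60  /* leave this knob alone for a minute */

struct knob_state {
	int    consecutive_failures;
	time_t backoff_until;
};

/* Call after each attempted adjustment; returns false while backing off. */
static bool may_retry(struct knob_state *k, bool last_attempt_took_effect)
{
	time_t now = time(NULL);

	if (now < k->backoff_until)
		return false;

	if (last_attempt_took_effect) {
		k->consecutive_failures = 0;
		return true;
	}

	if (++k->consecutive_failures >= FAIL_LIMIT) {
		k->backoff_until = now + BACKOFF_SECS;
		k->consecutive_failures = 0;
		return false;
	}
	return true;
}
```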
Sorry, rereading this I misunderstood the original issue. It's not that tcp_[rw]mem exceeds net.core limits; it's that we can't exceed a 2GB limit when setting the parameter. I've pushed a potential fix #97 to the main branch - if you could test at your end that would be great. Thanks!
First, I want to say that BPFTune is super cool. This is really good stuff, and I'm thankful for the effort put into autotuning.
I think a good companion tool to this would leverage a sibling host to help identify optimal interface settings w.r.t. NIC buffers, MTU, MSS, etc... but that's a tangent of a different flavor.
I have observed a peculiar behavior with bpftune on my Proxmox hosts that I suspect others may hit as well.
Ceph wants to schlep lots of data around, so bpftune is trying to increase the TCP buffers accordingly.
...but it seems unaware of the 2GB limit on rmem_max and keeps hitting a wall attempting to increase rmem_max beyond 2GB:
This is 1 second of logs about this ;)
This makes sense, as net.core.rmem_max / net.core.wmem_max cap at 2GB-1. (This link isn't REALLY relevant, except that it points out the 2GB cap.)
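(For anyone wondering where the 2GB-1 ceiling comes from: these sysctls are stored as a plain C int in the kernel, so INT_MAX is the largest value a write can take. A tiny, standalone demo of that arithmetic, not kernel or bpftune code:)

```c
/* Simplified demo: any request above INT_MAX gets clamped to the
 * 2147483647 (2GB - 1) ceiling that an int-valued sysctl can hold. */
#include <limits.h>
#include <stdio.h>

int main(void)
{
	long long requested = 4LL * 1024 * 1024 * 1024;  /* e.g. asking for 4GB */
	long long ceiling   = INT_MAX;                   /* 2147483647 = 2GB - 1 */
	long long effective = requested > ceiling ? ceiling : requested;

	printf("requested %lld, kernel ceiling %lld, effective %lld\n",
	       requested, ceiling, effective);
	return 0;
}
```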
So... I suppose I'd expect bpftune to be aware that this is (seemingly?) a hard limit... or at least identify that it's hitting some limitation, and perhaps intelligently try to find a maximal value, rather than simply trying to increase by 25% again and again?
Thoughts?
Again, thanks for making this... it's slick.