Logging backpressure #584
Comments
Hive captures client logs by attaching to the container via the Docker API. I think Docker does not, or cannot, apply this buffering to the attached connection. I will look into it.
Yes, it looks like the issue is that Hive does not use the equivalent of 'docker logs' to get container logs. Instead, we attach to the container before it starts in order to capture all stdout/stderr. I think we could switch to fetching the logs asynchronously for client containers using a different API.
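As a rough illustration of the asynchronous idea (not Hive's code, and not the Docker API), the Python sketch below drains a pipe on a dedicated thread so the writer never waits on a full pipe buffer. The names `drain` and `run_with_async_drain` are hypothetical, and an OS pipe stands in for the container's stdout/stderr stream.

```python
import os
import threading


def drain(read_fd: int, sink: list) -> None:
    """Read chunks until EOF so the writing side is never left blocked."""
    with os.fdopen(read_fd, "rb") as reader:
        for chunk in iter(lambda: reader.read(65536), b""):
            sink.append(chunk)


def run_with_async_drain(payload: bytes, repeats: int) -> int:
    """Write payload `repeats` times while a background thread drains it.

    Returns the total number of bytes the drainer collected.
    """
    read_fd, write_fd = os.pipe()
    sink: list = []
    reader_thread = threading.Thread(target=drain, args=(read_fd, sink))
    reader_thread.start()

    # Closing the writer (end of the with block) signals EOF to the drainer.
    with os.fdopen(write_fd, "wb") as writer:
        for _ in range(repeats):
            writer.write(payload)

    reader_thread.join()
    return sum(len(chunk) for chunk in sink)


if __name__ == "__main__":
    # 1 MiB in total, which is more than a typical pipe buffer holds.
    print(run_with_async_drain(b"x" * 1024, 1024))
```

Without the drainer thread, the writer in this sketch would stall as soon as the pipe buffer filled up, which is the behaviour suspected in this issue.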
Prototype implementation is here: #735
I cannot reproduce the original issue; my machine is too fast. I use this command to test:
The testnet participation is 0.97, and the PR makes no difference to the result.
Hello
I'm trying to run the Hive test suite with a high log level, DEBUG or TRACE, and I see a huge performance degradation compared to the INFO level. Not only does the participation level drop from 0.97 to 0.90-0.93, but the client also fails to reply to the test suite's RPC API requests within the 5-second timeout. Since a lower logging level works well with the same setup, and the logs for the unresponsive container stop a few seconds before the failure, I guess the reason is logging backpressure. Docker log output to stdout and stderr is blocking by default: when it is unable to flush the output in time, the container stalls and resumes only after the output is fetched (see docs). It's a highly repeatable issue for me with high log levels.
I've tried to set Docker logging to "non-blocking" mode, but it doesn't help. Maybe the reason is elsewhere, or I'm doing something wrong by patching this:
Debugging tests with higher log levels looks valuable, but with blocking logging the test fails early.
Could you please help make this work?
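To make the suspected mechanism concrete, here is a toy Python model of the two delivery modes. It is only a sketch: a bounded `queue.Queue` stands in for the log buffer, nothing drains it, and the helper names (`emit_blocking`, `emit_non_blocking`, `simulate`) are hypothetical rather than anything from Docker or Hive. In blocking mode the writer stalls once the buffer is full. In non-blocking mode the writer keeps running but lines are dropped, which matches how Docker documents its `non-blocking` mode.

```python
import queue


def emit_blocking(buf: queue.Queue, line: str, timeout: float) -> bool:
    """Blocking mode: wait for free space; False means the writer stalled."""
    try:
        buf.put(line, timeout=timeout)
        return True
    except queue.Full:
        return False


def emit_non_blocking(buf: queue.Queue, line: str) -> bool:
    """Non-blocking mode: never wait; False means the line was dropped."""
    try:
        buf.put_nowait(line)
        return True
    except queue.Full:
        return False


def simulate(lines: int, capacity: int) -> dict:
    """Write `lines` log lines into buffers of size `capacity` that nobody reads."""
    blocking_buf: queue.Queue = queue.Queue(maxsize=capacity)
    dropping_buf: queue.Queue = queue.Queue(maxsize=capacity)

    # Blocking writer: stops at the first line that cannot be delivered.
    stalled_at = None
    for i in range(lines):
        if not emit_blocking(blocking_buf, f"line {i}", timeout=0.01):
            stalled_at = i
            break

    # Non-blocking writer: finishes every iteration but loses the overflow.
    dropped = 0
    for i in range(lines):
        if not emit_non_blocking(dropping_buf, f"line {i}"):
            dropped += 1

    return {"stalled_at": stalled_at, "dropped": dropped}


if __name__ == "__main__":
    print(simulate(lines=10, capacity=4))
```

In this model the blocking writer stalls at the fifth line, while the non-blocking writer completes and loses six lines. If the real container still stalls with non-blocking mode set, that would suggest the option is not reaching the path Hive uses to capture logs.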