-
Notifications
You must be signed in to change notification settings - Fork 701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: File descriptor leaks for eventfd and timerfd_create #1158
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #1158 +/- ##
==========================================
+ Coverage 77.92% 78.11% +0.19%
==========================================
Files 53 53
Lines 6990 7074 +84
==========================================
+ Hits 5447 5526 +79
- Misses 1543 1548 +5
☔ View full report in Codecov by Sentry. |
ABI check has failed due to the destructor. I would appreciate any help with it. Also, some clangs check failed, but it looks like the failures aren't related to the PR. |
Probably related: #44 |
Hey @tyler92. Thanks for your PR. Two requests:
|
@kiplingw Done. I've implemented a simple check of the Valgrind report
I see the test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tyler92, thank you for this patch. I'm not (yet) familiar with this part of the codebase, so I can't say for sure if this is the right fix for the problem, but it looks like a sensible solution to me.
I'd appreciate if someone with some experience with these kinds of file descriptors would give some approval here, but if not, I think we can merge this anyway. If some issue is discovered, this patch can always be reverted and reworked.
I was going to say the same thing. It seems fine, but it's an area of the code base I'm less familiar with. @dennisjenkins75 thoughts? |
@kiplingw @dennisjenkins75 Hi. Do we have any chance to review/merge it? |
I'm fine with it. But while we wait for @dennisjenkins75's feedback, I'd suggest bumping the patch version in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be updated in lieu of the most recent merged PR.
d0eef78
to
f148290
Compare
Done |
Hey @tyler92. Thank you for your PR. While we await @dennisjenkins75, I can probably anticipate at least some of his feedback: I think it would be prudent if you were able to add a unit test somehow to verify the leak and the fix. You might have to be creative by monitoring |
Oh, yeah, thanks. For some reason, I lost my unit test while rebasing my branch. Please find the unit test in the latest push. The test failed without the patch, and it's passed with it. |
tests/http_server_test.cc
Outdated
} | ||
else | ||
{ | ||
std::cout << "NOTE: Please use Valgrind with '--track-fds=yes' option for this test" << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tyler92, I am a bit concerned about this. Does the user need to run the unit test in valgrind(1) in order to verify that it passed? The reason I ask is it would be cumbersome for a user to have to manually do that, as opposed to the build environment automatically doing that. If it can't do that, then if the test fails we'll never know during CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you. It's related to the first question from another thread. The logic here is the following - we need to handle the case when our "leak detector" fails. In this case, we can do one of the following: fail the test, pass the test, or pass with a warning. I chose the last option, but if we have full confidence in this leak detection approach, we may fail the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't have full confidence yet in this approach and it might need refactoring. You might need to modify the build environment itself to ensure that the test is run via valgrind(1)
. You'd then need to add the valgrind
package to debian/control
in the Build-Dep
stanza as well as making sure the build environment finds it at configure time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kiplingw Probably there is a misunderstanding. When I said "leak detection approach", I meant not valgrind, but checking file descriptors count. Now I suggest removing anything about Valgrind in my test, and it will look like this:
TEST(http_server_test, http_server_is_not_leaked)
{
const auto number_of_fds = [] {
using filesystem::directory_iterator;
const filesystem::path fds_dir { "/proc/self/fd" };
if (!filesystem::exists(fds_dir))
{
return directory_iterator::difference_type(0);
}
return std::distance(directory_iterator(fds_dir), directory_iterator {});
};
const auto fds_before = number_of_fds();
ASSERT_GT(fds_before, 0);
// ...
const auto fds_after = number_of_fds();
ASSERT_EQ(fds_before, fds_after);
Valgrind was a fallback option in my original idea, but it has downsides.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dennisjenkins75, thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you probably intend to remove the Valgrind fallback this isn't very relevant, but in general when skipping tests in GoogleTest you should use GTEST_SKIP()
(documented at https://google.github.io/googletest/advanced.html#skipping-test-execution) since it will properly signal to the testing environment that a test case was skipped.
tests/http_server_test.cc
Outdated
server->shutdown(); | ||
server.reset(); | ||
|
||
if (fds_before > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Under what conditions would you expect
number_of_fds()
to return 0? gunit'smain()
does not close STDOUT nor STDERR, so there should always be at least two. - Why test for this condition AFTER running the HTTP server? Why not test for it on line 1069?
- Do you think that it is important to process an actual HTTP request within this test?
4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The answer is simple - only if
/proc/self/fd
directory doesn't exist. I know that the library supports only Linux now, but I wasn't sure if I could make this assumption for unit tests. Do you recommend removing this check? - There is no significant reason, just to make the unit test less noisy
- For this particular leak - it doesn't matter. Regarding leaks in general, I think it's important to test a realistic use case. So if you recommend adding it - I will
Suggestion:
|
And just so you know @tyler92, we're not trying to give you the runaround. Your PR in principal makes sense. We just want to make sure we don't make new problems. |
@kiplingw @dennisjenkins75 I've added new target tests/helpers with the function for getting a count of open file descriptors and tests. I'm not sure about the style for the copyright header, CMakeLists.txt, and meson.build, so please let me know if you don't like something (even minor things) |
I can't see any immediate issues that stand out, but interested to hear what @dennisjenkins75 says. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was no
close
call for the file descriptors created witheventfd
andtimerfd_create
. It led to a file descriptors leak. I'm really not sure if the fix may have a bad impact, but at least there is no leak now.