Simplify stats polling logic #7987
base: master
Is the TODO on line 126 completed now?
Removed this TODO for now - if we wanted to remove this "baseline" tracking logic, but still be able to properly track usage for remote persistent workers, then we'd have to migrate all of the container processes to a new cgroup for each task executed, which seems a bit too complicated. We can maybe revisit later.
@@ -392,7 +298,7 @@ func TrackStats(ctx context.Context, c CommandContainer) (stop func(), res <-cha
// the container, which can take a few hundred ms or possibly longer
The mention of podman on the line above might be out of date.
That part is still accurate, since podman doesn't (yet) have the logic where we create the cgroup then run the container in that cgroup that we created. Instead, podman manages the cgroup lifecycle itself.
I removed this note for now though, since it's not super important. The bigger TODO is to make sure that the container implementations don't delete the cgroup from under us (this is a problem with both ociruntime and podman). That way, we could remove this polling, and reliably read the cgroup stats once at the end of execution.
This PR cleans up some stats logic in preparation for tracking firecracker stats using cgroups, instead of tracking stats using the guest vmexec server. The current logic is somewhat tangled up, which would make the integration with firecracker messier than it needs to be.
[...] c.Stats(). This logic is complex and its value is pretty questionable.
[...] executor.child_cgroups_enabled is rolled out.
[...] TrackStats utility function, which would start a loop that calls back to the container's Stats() function, which was expected to call usageStats.Update() then return usageStats.TaskStats().
[...] TrackStats is a method of the UsageStats struct, and manages these careful state updates internally, without a circular call back to the container.Stats() function, and without expecting the caller to call usageStats.Reset() and usageStats.TaskStats()