Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix mem usage reporting when using docker limits #5011

Merged
merged 3 commits into from
Jan 11, 2023
Merged

Fix mem usage reporting when using docker limits #5011

merged 3 commits into from
Jan 11, 2023

Conversation

leccelecce
Copy link
Contributor

@leccelecce leccelecce commented Jan 10, 2023

Fixes #5004

This identifies if we are running in a container with a memory limit set. If so, rather than taking the %MEM from top directly, it calculates it as (resident memory / docker limit).

It seems the base container is different to the production builds - the path changes slightly. I've handled both cases for now.

[edit]
After some local testing, my expectation is that this will work with Docker engine 20.10 (released December 2020) on suitably modern hosts, but not WSL2 on Windows without extra configuraiton.

@netlify
Copy link

netlify bot commented Jan 10, 2023

Deploy Preview for frigate-docs canceled.

Name Link
🔨 Latest commit 0f10831
🔍 Latest deploy log https://app.netlify.com/sites/frigate-docs/deploys/63bf44acfa80da00081b3699

frigate/util.py Outdated
Comment on lines 761 to 779
# devcontainer seems to use this path
memlimit_command = ["cat", "/sys/fs/cgroup/memory/memory.max_usage_in_bytes"]

p = sp.run(
memlimit_command,
encoding="ascii",
capture_output=True,
)

if p.returncode != 0:
logger.debug(f"Unable to get docker memlimit: {p.stderr}")
return -1
else:
value: str = p.stdout

if value.strip().isnumeric():
return int(value)
elif value.strip().lower() == "max":
return -1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it makes sense to handle this in the dev container. I don't expect our config to be setting memory limits and in any case I don't really see the benefit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, apart from anything else it's far easier to test there. I still haven't got a "proper" docker build working locally, so had to get the path by attaching to a container and looking up the path manually.

I also tend to avoid different code in dev/release environments as far as possible to catch bugs earlier.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I see it having code to specifically handle the dev container is different code. There's also already a number of differences in the way the dev container runs vs the release container as the dev container doesn't use S6, frigate is started manually, etc. so to me that is moot.

Just my 2 cents happy to hear what others think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give any pointers to producing a release docker image locally? I tried docker build . -t mybuild but it just produces another dev container from what I can see

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give any pointers to producing a release docker image locally? I tried docker build . -t mybuild but it just produces another dev container from what I can see

make build

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give any pointers to producing a release docker image locally? I tried docker build . -t mybuild but it just produces another dev container from what I can see

make build

Thanks! I'm used to Java tooling in Eclipse/maven so VS Code, make, docker and devcontainers have been a bit of a learning curve

@leccelecce leccelecce marked this pull request as draft January 11, 2023 01:02
@leccelecce
Copy link
Contributor Author

leccelecce commented Jan 11, 2023

It turns out this isn't so much a case of devcontainer vs release build, but dependent on the version of Docker engine in use. The /sys/fs/cgroup/ paths changed from Linux cgroups to cgroups v2, which was introduced in Docker v20.10. This explains why I'm seeing different results across dev containers, my local build, and the release builds. Debian Bullseye is only just on 20.10 so slightly older distros may not have the latest engine. Docker Desktop is more likely to be up to date.

This should just be a case of handling both paths, but I'll rework this to do it in a more deliberate manner with an explicit check on which cgroups version we expect. The cgroups v1 path also needs correcting to use a different metric.

frigate/util.py Outdated Show resolved Hide resolved
frigate/util.py Outdated Show resolved Hide resolved
@leccelecce
Copy link
Contributor Author

So, it seems the official release builds of frigate are using a sufficiently new distro/docker version to default to cgroups v2. However my local devcontainer builds (build on Windows using WSL2) only show cgroups v1 enabled, which apparently may change in future but is not there yet (see microsoft/WSL#6662). I don't have other dev envs to test on so not clear what they do.

In theory both cgroups1 and 2 support checking memory limits - but on cgroups1 it appears to be problematic in some containers (just reporting 9223372036854771712 for example). On that basis, proceeding with a solution for cgroups v2 only.

Docker does support updates to memory limits on running containers, and the command to check memory is lightweight so running it each time should not impact stats load times.

@leccelecce leccelecce marked this pull request as ready for review January 11, 2023 23:23
@leccelecce leccelecce requested a review from NickM-27 January 11, 2023 23:26
@blakeblackshear blakeblackshear merged commit ddcae2d into blakeblackshear:dev Jan 11, 2023
@leccelecce leccelecce deleted the system_top_cgroups branch January 11, 2023 23:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants