I am trying the AMD device plugin on my system, deployed as a systemd unit on Debian 11 (so not as a DaemonSet, but directly on the K8s node). Everything works fine, and I can see two devices in my test container:
/dev/kfd
/dev/dri/renderD128
I am trying to run the container as an unprivileged user, such as nobody, but I am struggling to assign the proper permissions to the above devices. In the container I see something like the following (tested via nsenter):
The gid 106 is the render group on the underlying "bare metal" K8s worker OS, and it gets mapped into the test container as-is. With this setup I don't see a clear way to add nobody to render (or a similar group) in the Docker image. Is there a best practice that you can suggest?
Thanks in advance!
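To make the permission problem easier to reproduce, here is a minimal diagnostic sketch in Python that could be run inside the container. It only looks at the classic owner/group/other mode bits of the two device nodes and at the groups of the current process. The helper names `can_read_write` and `report` are hypothetical. The sketch does not account for ACLs, device cgroup rules, or a root user whose capabilities have been dropped.

```python
import os
import stat


def can_read_write(st_mode, st_uid, st_gid, uid, gids):
    """Return True if the owner/group/other mode bits grant read and write.

    Assumption: uid 0 still holds CAP_DAC_OVERRIDE and so bypasses the bits.
    """
    if uid == 0:
        return True
    if uid == st_uid:
        need = stat.S_IRUSR | stat.S_IWUSR
    elif st_gid in gids:
        need = stat.S_IRGRP | stat.S_IWGRP
    else:
        need = stat.S_IROTH | stat.S_IWOTH
    return (st_mode & need) == need


def report(path):
    """Describe the ownership and mode of a path and whether we can read/write it."""
    st = os.stat(path)
    # Effective gid plus supplementary groups of the current process.
    gids = set(os.getgroups()) | {os.getegid()}
    ok = can_read_write(st.st_mode, st.st_uid, st.st_gid, os.geteuid(), gids)
    return (
        f"{path}: mode={stat.filemode(st.st_mode)} "
        f"uid={st.st_uid} gid={st.st_gid} rw={'yes' if ok else 'no'}"
    )


if __name__ == "__main__":
    # Device paths taken from the issue text above.
    for dev in ("/dev/kfd", "/dev/dri/renderD128"):
        try:
            print(report(dev))
        except FileNotFoundError:
            print(f"{dev}: not present")
```

If a device is owned by root with gid 106 and a mode that grants read/write only to owner and group, a process running as nobody should get `rw=no` from this check unless gid 106 is among its groups.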
wmfgerrit pushed a commit to wikimedia/operations-puppet that referenced this issue on Apr 20, 2023
On k8s nodes we need to be able to bypass the restriction
on GPU related devices (/dev/kfd, /dev/dri/renderXXXX) set
for root:render, see
ROCm/k8s-device-plugin#39
We don't need anymore to vary the kfd access policies, so it seems
good to transform the option into something more flexible for
a broader range of use cases.
Bug: T333009
Change-Id: Idab004a1a725b1223d4ee36d2d0d900c329140f9