Problem running RStudio server with podman #838

Open
bernt-matthias opened this issue Aug 5, 2024 · 14 comments
Labels
help wanted Extra attention is needed needs more info Further information is requested question

Comments

@bernt-matthias

I'm trying to get rocker running with podman (our HPC only supports podman).

podman run     --rm     -ti     -p 8787:8787     -e DISABLE_AUTH=true     rocker/rstudio:latest-daily

This gives me the following output:

s6-supervise s6-fdholderd: fatal: unable to mkfifodir event: Value too large for data type
s6-supervise s6-fdholderd: fatal: unable to mkfifodir event: Permission denied
s6-supervise s6-fdholderd: fatal: unable to mkfifodir event: Permission denied
s6-supervise s6-fdholderd: fatal: unable to mkfifodir event: Permission denied
s6-supervise s6-fdholderd: fatal: unable to mkfifodir event: Permission denied

Any ideas that help me understand / solve the issue?

As far as I see (curl http://localhost:8787/ gives me curl: (56) Recv failure: Connection reset by peer) the RStudio server is not up.

@eitsupi eitsupi transferred this issue from rocker-org/rocker Aug 5, 2024
@cboettig
Member

cboettig commented Aug 5, 2024

can you try with rocker/rstudio:latest instead? (unfortunately the daily tag is not successfully building at this time, cc @eitsupi )

@bernt-matthias
Author

Thanks for the fast reply. Same problem with rocker/rstudio:latest (I stumbled over this problem when I first tried to get this running, so it was unlikely that latest would work).

@eitsupi
Member

eitsupi commented Aug 6, 2024

unfortunately the daily tag is not successfully building at this time

The latest-daily tag is obsolete.

@cboettig
Member

cboettig commented Aug 6, 2024

@bernt-matthias looks like you may be running in an environment where root is blocked? Can you test rocker/binder instead?

podman run --rm -p 8888:8888 docker.io/rocker/binder

(apologies for all the suggestions, shooting a bit in the dark here. binder runs without root via jupyterhub and may thus sidestep the issue).

@eitsupi eitsupi changed the title Problem running rocker (RStudio server) with podman Problem running RStudio server with podman Aug 7, 2024
@eitsupi eitsupi added help wanted Extra attention is needed needs more info Further information is requested labels Aug 7, 2024
@bernt-matthias
Author

@bernt-matthias looks like you may be running in an environment where root is blocked?

I would guess so, given that our HPC system (or its admins) is quite restrictive.

Can you test rocker/binder instead?

podman run --rm -p 8888:8888 docker.io/rocker/binder

This gives me:

Writing manifest to image destination
WARN[0411] Additional gid=50 is not present in the user namespace, skip setting it 
Error: OCI runtime error: crun: cannot setresgid to `1000`: Invalid argument

So I guess that means that you are right?

(apologies for all the suggestions, shooting a bit in the dark here. binder runs without root via jupyterhub and may thus sidestep the issue).

Nothing to worry about. Anything is highly appreciated and I'm happy to try any suggestion.

@benz0li
Contributor

benz0li commented Aug 7, 2024

@bernt-matthias Does podman run -ti --rm debian work?

You could give glcr.b-data.ch/jupyterlab/r/geospatial a try, i.e.

podman run --rm \
  -p 8888:8888 \
  glcr.b-data.ch/jupyterlab/r/geospatial

or

podman run --rm \
  -p 8888:8888 \
  -u root \
  -e NB_USER=root \
  -e NB_UID=0 \
  -e NB_GID=0 \
  -e NOTEBOOK_ARGS="--allow-root" \
  glcr.b-data.ch/jupyterlab/r/geospatial

@bernt-matthias
Author

Does podman run -ti --rm debian work?

Yes.

You could give glcr.b-data.ch/jupyterlab/r/geospatial a try, i.e.

podman run --rm \
  -p 8888:8888 \
  glcr.b-data.ch/jupyterlab/r/geospatial

This just gives me: Error: OCI runtime error: crun: cannot setresgid to 100: Invalid argument

podman run --rm \
  -p 8888:8888 \
  -u root \
  -e NB_USER=root \
  -e NB_UID=0 \
  -e NB_GID=0 \
  -e NOTEBOOK_ARGS="--allow-root" \
  glcr.b-data.ch/jupyterlab/r/geospatial

This seems to do much more:

Entered start.sh with args: start-notebook.sh
Running hooks in: /usr/local/bin/start-notebook.d as uid: 0 gid: 0
Sourcing shell script: /usr/local/bin/start-notebook.d/10-populate.sh
Done running hooks in: /usr/local/bin/start-notebook.d
Updated the jovyan user:
- username: jovyan       -> root
- home dir: /home/jovyan -> /home/root
Attempting to copy /home/jovyan to /home/root...
Success!
Changing working directory to /home/root/
Running hooks in: /usr/local/bin/before-notebook.d as uid: 0 gid: 0
Sourcing shell script: /usr/local/bin/before-notebook.d/10-env.sh
Sourcing shell script: /usr/local/bin/before-notebook.d/11-home.sh

But there is an error in the "end": runuser: cannot set groups: Operation not permitted

@nathanweeks

I'd suspect a podman configuration issue; the output of podman info might provide clues.

Both podman run --rm -p 8888:8888 docker.io/rocker/binder and podman run --rm -p 8787:8787 -e DISABLE_AUTH=true docker.io/rocker/rstudio:latest worked on the HPC cluster I tested them on.

@bernt-matthias
Author

Thanks for the feedback @nathanweeks. Good news is that it does work in principle :)

Here is my podman info. Maybe you can diff it against yours and check whether there is a critical difference:

host:
  arch: amd64
  buildahVersion: 1.37.5
  cgroupControllers:
  - cpuset
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.10-1.el9.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: 3ea3d7f99779af0fcd69ec16c211a7dc3b4efb60'
  cpuUtilization:
    idlePercent: 96.88
    systemPercent: 0.57
    userPercent: 2.56
  cpus: 56
  databaseBackend: sqlite
  distribution:
    distribution: rocky
    version: "9.4"
  eventLogger: file
  freeLocks: 2046
  hostname: bioinf3
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 4533
      size: 1
    uidmap:
    - container_id: 0
      host_id: 61715
      size: 1
  kernel: 5.14.0-427.13.1.el9_4.x86_64
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 1597776486400
  memTotal: 1622256828416
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.10.0-3.el9_4.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: netavark-1.10.3-1.el9.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.10.3
  ociRuntime:
    name: crun
    package: crun-1.14.3-1.el9.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.14.3
      commit: 1961d211ba98f532ea52d2e80f4c20359f241a98
      rundir: /run/user/61715/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240806.gee36266-2.el9.x86_64
    version: |
      pasta 0^20240806.gee36266-2.el9.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/61715/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.3-1.el9.x86_64
    version: |-
      slirp4netns version 1.2.3
      commit: c22fde291bb35b354e6ca44d13be181c76a0a432
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.2
  swapFree: 0
  swapTotal: 0
  uptime: 406h 19m 48.00s (Approximately 16.92 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /gpfs1/schlecker/home/songalax/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 0
    stopped: 1
  graphDriverName: overlay
  graphOptions:
    overlay.force_mask: "700"
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.13-1.el9.x86_64
      Version: |-
        fusermount3 version: 3.10.2
        fuse-overlayfs: version 1.13-dev
        FUSE library version 3.10.2
        using FUSE kernel interface version 7.31
  graphRoot: /home/songalax/.local/share/containers/storage
  graphRootAllocated: 32985348833280
  graphRootUsed: 19623126761472
  graphStatus:
    Backing Filesystem: gpfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Supports shifting: "true"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 7
  runRoot: /tmp/containers-user-61715/containers
  transientStore: false
  volumePath: /gpfs1/schlecker/home/songalax/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.2
  Built: 1731414899
  BuiltTime: Tue Nov 12 13:34:59 2024
  GitCommit: ""
  GoVersion: go1.22.7 (Red Hat 1.22.7-2.el9_5)
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.2

podman run --rm -p 8888:8888 docker.io/rocker/binder gives me:

WARN[0632] Additional gid=50 is not present in the user namespace, skip setting it 
Error: OCI runtime error: crun: cannot setresgid to `1000`: Invalid argument
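
One detail in the podman info output above may be relevant: idMappings lists a single entry of size 1 for both uidmap and gidmap, so only container ID 0 appears to be mapped into the user namespace. That would be consistent with crun failing to setresgid to 1000, although this is an interpretation and not something confirmed in this thread. A minimal sketch of the check, assuming the map is given in the /proc/<pid>/gid_map layout (the helper name id_is_mapped is hypothetical):

```shell
# Return 0 if container ID $1 is covered by an ID map read from stdin.
# Each input line uses the /proc/<pid>/gid_map layout:
#   <container_start> <host_start> <size>
id_is_mapped() {
  awk -v id="$1" '
    (id + 0) >= ($1 + 0) && (id + 0) < ($1 + $3) { found = 1 }
    END { exit !found }
  '
}

# The gidmap from the podman info output above, written in that layout:
printf '0 4533 1\n' | id_is_mapped 1000 || echo "gid 1000 is not mapped"

# On a real system the live map could be fed in instead, for example:
#   podman unshare cat /proc/self/gid_map | id_is_mapped 1000
```

If the live map really has only one entry, the usual place to look is whether the user has ranges in /etc/subuid and /etc/subgid; on a managed HPC system that is typically something only the admins can change.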

@bernt-matthias
Author

podman run  --rm  -p 8787:8787  -e DISABLE_AUTH=true  docker.io/rocker/rstudio:latest
....
Writing manifest to image destination
s6-supervise s6-fdholderd: fatal: unable to mkfifodir event: Value too large for data type
s6-supervise s6-fdholderd: fatal: unable to mkfifodir event: Permission denied
...

@nathanweeks

graphRoot: /home/songalax/.local/share/containers/storage

I assume that path is on an NFS (or Lustre, GPFS, or other networked) file system? Note that it must be on node-local storage (see man storage.conf for documentation, specifically the rootless_storage_path description, as well as this blog post). In that case, you can try editing your storage.conf file (/gpfs1/schlecker/home/songalax/.config/containers/storage.conf) and changing the following settings (for consistency with your runroot setting):

[storage]
graphroot="/tmp/containers-user-$UID/storage"
rootless_storage_path="/tmp/containers-user-$UID/storage"

Then issue the command podman system reset (note the warning about the container objects that will be removed) and try the podman run commands again.
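
Whether $UID is expanded in both keys is worth verifying against man storage.conf. A minimal sketch that sidesteps the question by printing the fragment with the numeric UID already filled in (render_storage_conf is a hypothetical helper, and the /tmp default only mirrors the paths suggested above):

```shell
# Print a [storage] fragment that points rootless storage at a per-user
# directory under BASE. Arguments: <uid> [base], base defaults to /tmp.
render_storage_conf() {
  uid="$1"
  base="${2:-/tmp}"
  dir="$base/containers-user-$uid/storage"
  printf '[storage]\n'
  printf 'graphroot = "%s"\n' "$dir"
  printf 'rootless_storage_path = "%s"\n' "$dir"
}

# Preview the fragment for the current user; nothing is written to disk.
render_storage_conf "$(id -u)"
```

The output is meant to be reviewed and merged into the existing storage.conf by hand, followed by podman system reset as described above.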

@bernt-matthias
Author

This changed something. Now I get something like the following for both:

Error: copying system image from manifest list: writing blob: adding layer with blob "sha256:6414378b647780fee8fd903ddb9541d134a1947ce092d08bdeb23a54cb3684ac"/""/"sha256:2573e0d8158209ed54ab25c87bcdcb00bd3d2539246960a3d592a1c599d70465": creating read-only layer with ID "2573e0d8158209ed54ab25c87bcdcb00bd3d2539246960a3d592a1c599d70465": lsetxattr /tmp/containers-user-61715/storage/overlay/2573e0d8158209ed54ab25c87bcdcb00bd3d2539246960a3d592a1c599d70465/diff: operation not supported

/tmp is a tmpfs. Our kernel is at 5.14.0-427.13.1.el9_4.x86_64

@bernt-matthias
Author

Might be related to our kernel version: moby/moby#47962 ..
If this is the case, could another storage driver help?

@nathanweeks

Might be related to our kernel version: moby/moby#47962 .. If this is the case, could another storage driver help?

You could try the "vfs" storage driver, though note the disadvantages of vfs compared to "overlay" with respect to storage overhead; it would seem to be a particularly inefficient use of a memory-backed tmpfs.

Alternatively, do the compute nodes on your institution's HPC cluster have a node-local secondary/disk storage filesystem path (with a non-tmpfs filesystem, e.g., ext4, XFS, ZFS) that is user-writable? If so, you could try changing graphroot & rootless_storage_path to point to a path on that filesystem that can be created/written to. (The df -hl command can help identify local filesystem mount paths)
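
As a rough complement to df -hl, the following sketch filters /proc/mounts-style lines down to mount points on local disk filesystems. The list of filesystem types is an assumption taken from the examples above, and local_disk_mounts is a hypothetical helper name:

```shell
# Read /proc/mounts-style lines on stdin and print the mount points whose
# filesystem type is a local disk type (tmpfs and network filesystems are
# skipped because they do not match the list).
local_disk_mounts() {
  awk '$3 ~ /^(ext[234]|xfs|zfs|btrfs)$/ { print $2 }'
}

# On a compute node this could be run as:
#   local_disk_mounts < /proc/mounts
```

Each candidate still needs to be user-writable, which can be checked with test -w on the path before pointing graphroot at it.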
