Support for Debian #67

Open
fuog opened this issue Dec 26, 2022 · 5 comments

Comments

fuog commented Dec 26, 2022

Hi there,

I wanted to thank you for the nice Ansible role. Unfortunately, Debian does not seem to be officially supported, but I got it working with a bit of variable overriding.
I would be happy if Debian were officially supported. Until then, maybe this will help someone on Debian use this role anyway.

# my playbook
.....
  roles:
    - role: unix-basics
      tags: unix-basics
    - role: xanmanning.k3s
      tags: k3s
    - role: nvidia.nvidia_driver  # should run after cluster install
      vars:
        # See https://github.com/NVIDIA/ansible-role-nvidia-driver#role-variables
        nvidia_driver_ubuntu_cuda_repo_baseurl: 'https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64'  # enforced 'debian11'
        nvidia_driver_ubuntu_install_from_cuda_repo: yes
        nvidia_driver_persistence_mode_on: yes
        ansible_distribution: Ubuntu  # force the role into its Ubuntu code path
      when: ansible_hostname == 'k3s-worker1'  # we only have ONE node with NVIDIA
      tags:
        - nvidia
....

Uzurka commented May 11, 2023

+1, Debian support would be a great thing :D

Uzurka commented Jun 29, 2023

Hey Fuog,
I tried your "bypass" today and encountered this error:

redirecting (type: modules) ansible.builtin.kernel_blacklist to community.general.kernel_blacklist
redirecting (type: modules) community.general.kernel_blacklist to community.general.system.kernel_blacklist
redirecting (type: modules) ansible.builtin.kernel_blacklist to community.general.kernel_blacklist
redirecting (type: modules) community.general.kernel_blacklist to community.general.system.kernel_blacklist
Using module file /home/ludo/.local/lib/python3.8/site-packages/ansible_collections/community/general/plugins/modules/system/kernel_blacklist.py
Pipelining is enabled.
<192.168.1.2> ESTABLISH SSH CONNECTION FOR USER: root
<192.168.1.2> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/home/ludo/ssh/id_rsa"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="root"' -o ConnectTimeout=10 -o 'ControlPath="/home/ludo/.ansible/cp/f1b6b591d3"' 192.168.1.2 '/bin/sh -c '"'"'/usr/bin/python3 && sleep 0'"'"''
<192.168.1.2> (1, b'\n{"path": "/tmp/tmpla68agn1", "details": "Error while setting attributes: /tmp/tmpla68agn1: Operation not supported\\n", "exception": "Traceback (most recent call last):\\n  File \\"/tmp/ansible_kernel_blacklist_payload_lwuic3yg/ansible_kernel_blacklist_payload.zip/ansible/module_utils/basic.py\\", line 1003, in set_attributes_if_different\\n    raise Exception(\\"Error while setting attributes: %s\\" % (out + err))\\nException: Error while setting attributes: /tmp/tmpla68agn1: Operation not supported\\n\\n", "failed": true, "msg": "chattr failed", "uid": 0, "gid": 0, "owner": "root", "group": "root", "mode": "0644", "state": "file", "size": 0, "invocation": {"module_args": {"name": "nouveau", "state": "present", "blacklist_file": "/etc/modprobe.d/blacklist-ansible.conf"}}}\n', b'')
<192.168.1.2> Failed to connect to the host via ssh: 
The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_kernel_blacklist_payload_lwuic3yg/ansible_kernel_blacklist_payload.zip/ansible/module_utils/basic.py", line 1003, in set_attributes_if_different
    raise Exception("Error while setting attributes: %s" % (out + err))
Exception: Error while setting attributes: /tmp/tmpla68agn1: Operation not supported

fatal: [openmediavault]: FAILED! => {
    "changed": false,
    "details": "Error while setting attributes: /tmp/tmpla68agn1: Operation not supported\n",
    "gid": 0,
    "group": "root",
    "invocation": {
        "module_args": {
            "blacklist_file": "/etc/modprobe.d/blacklist-ansible.conf",
            "name": "nouveau",
            "state": "present"
        }
    },
    "mode": "0644",
    "msg": "chattr failed",
    "owner": "root",
    "path": "/tmp/tmpla68agn1",
    "size": 0,
    "state": "file",
    "uid": 0
}

Any idea why?

Zorlin commented Aug 1, 2023

> Hey Fuog, I tried your "bypass" today and encountered this error:
>
> Any idea why?

Try installing the acl package.
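
On a Debian host, that suggestion could be expressed as a task like the one below. This is only a sketch: it assumes the play runs with privilege escalation available, and the thread does not confirm that it resolves the `chattr failed` error above.

```yaml
# Sketch: install the acl package as suggested above.
# Assumption: the target is Debian-based and apt is available.
- name: Install the acl package
  ansible.builtin.apt:
    name: acl
    state: present
    update_cache: true
  become: true
```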

Uzurka commented Aug 1, 2023

I gave up on this role until NVIDIA updates it to support Debian, which is one of the most widely used server distributions. Anyway, I install and update my driver using these tasks:

    - name: Add contrib & non-free repository
      replace:
        dest: /etc/apt/sources.list
        regexp: '^(deb(?!.* contrib).*)'
        replace: '\1 contrib non-free'
      notify: Apt cache update
      tags: nvidia


    - name: Install the NVIDIA drivers
      apt:
        name: nvidia-driver
        autoremove: false
        dpkg_options: 'force-confnew'
      environment:
        DEBIAN_FRONTEND: noninteractive
      tags: nvidia

    - name: Install dependencies
      apt:
        update_cache: true
        name:
          - gnupg
          - build-essential
          - dirmngr
          - mariadb-server
          - docker-compose
          - docker-compose-plugin
          - python3-pymysql
          - nvidia-smi
          - nvidia-container-toolkit
          - nvidia-container-runtime
          - nvidia-docker2
      tags: dependancies 

The dependencies list contains almost everything I need for my server, including the nvidia-docker and nvidia-container packages.
Everything works fine with it.
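
The first task above notifies a handler named `Apt cache update`, which is not shown in this comment. A minimal sketch of what such a handler might look like, assuming it only needs to refresh the apt cache (the handler actually used is not in this thread):

```yaml
# Hypothetical handler matching the `notify: Apt cache update` line above.
# Assumption: refreshing the package index is all it does.
handlers:
  - name: Apt cache update
    ansible.builtin.apt:
      update_cache: true
```

Handlers normally run at the end of the play, so if the driver install task needs the new `contrib non-free` components right away, a `- meta: flush_handlers` task placed before it may be needed.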

jpellman commented Aug 18, 2023

Just as a caveat for those trying to use @fuog's solution:

While install-redhat.yml has a task to install Linux kernel headers (see here), install-ubuntu.yml does not have an equivalent task. If you want the NVIDIA driver to install/compile properly under Debian, you will also need to install linux-headers-{{ ansible_kernel }} in a separate task somewhere.
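
Such a task might look roughly like this. It is a sketch that assumes fact gathering is enabled (so `ansible_kernel` is defined) and that a headers package for the running kernel exists in the configured repositories:

```yaml
# Sketch: install headers for the currently running kernel on Debian.
# Assumption: a linux-headers-<kernel release> package is available via apt.
- name: Install Linux kernel headers for the running kernel
  ansible.builtin.apt:
    name: "linux-headers-{{ ansible_kernel }}"
    state: present
    update_cache: true
  become: true
```

It would need to run before the `nvidia.nvidia_driver` role, for example under `pre_tasks` in the playbook.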
