dkms install fails #115

Open
remopini opened this issue Feb 2, 2023 · 3 comments

remopini commented Feb 2, 2023

I have an issue when trying to install the kernel module:

root@proxmox01:~# dkms install -m nvidia -v 525.85.07
Kernel preparation unnecessary for this kernel. Skipping...
Building module:
cleaning build area...
'make' -j24 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.15.83-1-pve modules......(bad exit status: 2)
Error! Bad return status for module build on kernel: 5.15.83-1-pve (x86_64)
Consult /var/lib/dkms/nvidia/525.85.07/build/make.log for more information.

If I look at the log referenced above, I see a lot of errors in the form of:

...
/root/vgpu_unlock/vgpu_unlock_hooks.c:589:4: note: (near initialization for ‘vgpu_unlock_vgpu[155].num_blocks’)
589 | { (10 + 2 * strlen(name) + 15) / 16, /* num_blocks */ \
| ^
/root/vgpu_unlock/vgpu_unlock_hooks.c:770:2: note: in expansion of macro ‘VGPU’
770 | VGPU(0x2230, 0x151a, "NVIDIA RTXA6000-48C"),
| ^~~~
/root/vgpu_unlock/vgpu_unlock_hooks.c:590:4: error: initializer element is not constant
590 | strlen(name), /* name1_len */ \
| ^~~~~~
/root/vgpu_unlock/vgpu_unlock_hooks.c:770:2: note: in expansion of macro ‘VGPU’
770 | VGPU(0x2230, 0x151a, "NVIDIA RTXA6000-48C"),
| ^~~~
...

Is this caused by anything I screwed up?

@eebrains

The problem is the call to the library function 'strlen' in the helper macro: a function call is not a constant expression, so it cannot appear in the initializer of a static array.
I fixed it locally by adding a 'len' parameter to the macro for the string length, replacing strlen(name) with 'len' inside the macro, and then manually adding the correct length to each entry.

So basically the VGPU macro looks like this:

#define VGPU(dev_id, subsys_id, name, len) \
        { (10 + 2 * (len) + 15) / 16,        /* num_blocks */     \
          (len),                             /* name1_len */      \
          (len),                             /* name2_len */      \
          (dev_id),                          /* dev_id */         \
          0,                                 /* vend_id */        \
          (subsys_id),                       /* subsys_id */      \
          0,                                 /* subsys_vend_id */ \
          { name name } }                    /* name1_2 */

and the initializer looks like this:

static vgpu_unlock_vgpu_t vgpu_unlock_vgpu[] =
{
        /* Tesla M10 (Maxwell) */
        VGPU(0x13bd, 0x11cc, "GRID M10-0B", 11),
        VGPU(0x13bd, 0x11cd, "GRID M10-1B", 11),
        VGPU(0x13bd, 0x1339, "GRID M10-1B4", 12),
        VGPU(0x13bd, 0x1286, "GRID M10-2B", 11),
        VGPU(0x13bd, 0x12ee, "GRID M10-2B4", 12),
...

That initializer is pretty long, and you have to do every entry. It took a while to edit every entry in nano... but it worked :)
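
Since the lengths above are typed by hand, a miscounted entry is easy to miss. The following is a minimal user-space sketch of a sanity check, not code from vgpu_unlock. The struct `demo_vgpu_t`, the table `demo_table`, and the function `count_length_mismatches` are hypothetical names made up for illustration, and the field types and the 64-byte name buffer are assumptions, because the real `vgpu_unlock_vgpu_t` layout is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for vgpu_unlock_vgpu_t. Field order follows the
 * comments in the VGPU macro; the types and buffer size are assumptions. */
typedef struct {
    uint8_t  num_blocks;
    uint8_t  name1_len;
    uint8_t  name2_len;
    uint16_t dev_id;
    uint16_t vend_id;
    uint16_t subsys_id;
    uint16_t subsys_vend_id;
    char     name1_2[64];   /* name stored twice; large enough to stay NUL-terminated */
} demo_vgpu_t;

/* Same shape as the macro above, with the length passed in by hand. */
#define VGPU(dev_id, subsys_id, name, len) \
    { (10 + 2 * (len) + 15) / 16, (len), (len), \
      (dev_id), 0, (subsys_id), 0, { name name } }

static const demo_vgpu_t demo_table[] = {
    VGPU(0x13bd, 0x11cc, "GRID M10-0B", 11),
    VGPU(0x13bd, 0x1339, "GRID M10-1B4", 12),
};

/* Count entries whose hand-typed length disagrees with the stored name.
 * name1_2 holds the name twice, so its length must equal 2 * name1_len. */
static size_t count_length_mismatches(const demo_vgpu_t *table, size_t n)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (strlen(table[i].name1_2) != 2u * table[i].name1_len)
            bad++;
    }
    return bad;
}
```

Calling `count_length_mismatches(demo_table, sizeof(demo_table) / sizeof(demo_table[0]))` from a small test program should return 0 when every length was counted correctly. The call to strlen is fine here because it runs at run time rather than inside a static initializer.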

@jforman96

Hello, I tried to make it work according to your instructions, but now I get a different error. Do you know where the problem could be?

In file included from /var/lib/dkms/nvidia/525.105.14/build/nvidia/os-interface.c:25:
/root/vgpu_unlock/vgpu_unlock_hooks.c:790:17: warning: ‘vgpu_unlock_bar3_end’ defined but not used [-Wunused-variable]
  790 | static uint64_t vgpu_unlock_bar3_end;
      |                 ^~~~~~~~~~~~~~~~~~~~
/root/vgpu_unlock/vgpu_unlock_hooks.c:789:17: warning: ‘vgpu_unlock_bar3_beg’ defined but not used [-Wunused-variable]
  789 | static uint64_t vgpu_unlock_bar3_beg;
      |                 ^~~~~~~~~~~~~~~~~~~~
/root/vgpu_unlock/vgpu_unlock_hooks.c:788:13: warning: ‘vgpu_unlock_bar3_mapped’ defined but not used [-Wunused-variable]
  788 | static bool vgpu_unlock_bar3_mapped = FALSE;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:297: /var/lib/dkms/nvidia/525.105.14/build/nvidia/os-interface.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile:1909: /var/lib/dkms/nvidia/525.105.14/build] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.15.107-2-pve'
make: *** [Makefile:82: modules] Error 2

Log file:
make.log

ksqeib commented Aug 26, 2023

The problem is the call to the library function 'strlen' in the helper macro.
I fixed it locally by replacing strlen(name) with sizeof(name) - 1 inside the helper macro. Because sizeof applied to a string literal is a compile-time constant, the individual entries do not need to be edited.

So basically the VGPU macro looks like this:

#define VGPU(dev_id, subsys_id, name) \
	{ (10 + 2 * (sizeof(name) - 1) + 15) / 16, /* num_blocks */     \
	  sizeof(name) - 1,                        /* name1_len */      \
	  sizeof(name) - 1,                        /* name2_len */      \
	  (dev_id),                                /* dev_id */         \
	  0,                                       /* vend_id */        \
	  (subsys_id),                             /* subsys_id */      \
	  0,                                       /* subsys_vend_id */ \
	  { name name } }                          /* name1_2 */

It worked :)
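
To illustrate why the sizeof variant compiles where strlen does not, here is a small user-space sketch. It is only an illustration under assumptions: `demo_vgpu_t`, `demo_table`, and `NAME_LEN` are hypothetical names, and the field types and buffer size are guesses, since the real `vgpu_unlock_vgpu_t` definition is not shown in this thread. It relies on `_Static_assert`, which needs a C11-capable compiler.

```c
#include <stdint.h>

/* Hypothetical stand-in for vgpu_unlock_vgpu_t. Field order follows the
 * comments in the VGPU macro; the types and buffer size are assumptions. */
typedef struct {
    uint8_t  num_blocks;
    uint8_t  name1_len;
    uint8_t  name2_len;
    uint16_t dev_id;
    uint16_t vend_id;
    uint16_t subsys_id;
    uint16_t subsys_vend_id;
    char     name1_2[64];
} demo_vgpu_t;

/* sizeof on a string literal counts the terminating NUL, hence the "- 1". */
#define NAME_LEN(name) (sizeof(name) - 1)

#define VGPU(dev_id, subsys_id, name) \
    { (10 + 2 * NAME_LEN(name) + 15) / 16, NAME_LEN(name), NAME_LEN(name), \
      (dev_id), 0, (subsys_id), 0, { name name } }

/* These checks are evaluated by the compiler, which shows that NAME_LEN
 * yields an integer constant expression usable in a static initializer. */
_Static_assert(NAME_LEN("GRID M10-0B") == 11, "literal length mismatch");
_Static_assert(NAME_LEN("NVIDIA RTXA6000-48C") == 19, "literal length mismatch");

static const demo_vgpu_t demo_table[] = {
    VGPU(0x13bd, 0x11cc, "GRID M10-0B"),
    VGPU(0x2230, 0x151a, "NVIDIA RTXA6000-48C"),
};
```

One caveat: this only works when `name` is a string literal (or a char array). If a `char *` were passed instead, sizeof would return the size of the pointer rather than the length of the string.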

My English is poor, so I reused @eebrains's answer as a template. Thanks for that. XD
