
Bug: Shared memory not working, results in Segfault #611

Open
abishekmuthian opened this issue Nov 7, 2024 · 9 comments

@abishekmuthian

Contact Details

[email protected]

What happened?

Thank you, Justine and team, for llamafile.

I have 16GB VRAM and 96GB RAM in my system (Fedora 41).

When I run gemma-2-27b-it.Q6_K.llamafile with -ngl 1, I get a segmentation fault.

The model works fine when I don't use GPU offloading. I use the same model in Ollama all the time, where VRAM and RAM are shared, resulting in better performance. I'm told llama.cpp falls back to system RAM when the model is larger than VRAM; doesn't llamafile do the same?
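
For reference, -ngl controls how many layers are offloaded to the GPU, with the remaining layers staying in system RAM. A minimal sketch of what I'd expect to work (the layer count here is a guess and would need tuning for a 27B Q6_K model on 16GB of VRAM):

./gemma-2-27b-it.Q6_K.llamafile -ngl 20   # offload ~20 layers, keep the rest in system RAM
./gemma-2-27b-it.Q6_K.llamafile -ngl 0    # CPU-only run, which works fine for me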

Version

llamafile v0.8.15

What operating system are you seeing the problem on?

Linux

Relevant log output

[abishek@MacubexROGLinux llamafile]$ ./gemma-2-27b-it.Q6_K.llamafile -ngl 1

██╗     ██╗      █████╗ ███╗   ███╗ █████╗ ███████╗██╗██╗     ███████╗
██║     ██║     ██╔══██╗████╗ ████║██╔══██╗██╔════╝██║██║     ██╔════╝
██║     ██║     ███████║██╔████╔██║███████║█████╗  ██║██║     █████╗
██║     ██║     ██╔══██║██║╚██╔╝██║██╔══██║██╔══╝  ██║██║     ██╔══╝
███████╗███████╗██║  ██║██║ ╚═╝ ██║██║  ██║██║     ██║███████╗███████╗
╚══════╝╚══════╝╚═╝  ╚═╝╚═╝     ╚═╝╚═╝  ╚═╝╚═╝     ╚═╝╚══════╝╚══════╝
 launching server...
error: Uncaught SIGSEGV (SEGV_MAPERR) at 0x8 on MacubexROGLinux pid 157694 tid 157701
  ./gemma-2-27b-it.Q6_K.llamafile
  No error information
  Linux Cosmopolitan 3.9.4 MODE=x86_64; #1 SMP PREEMPT_DYNAMIC Tue Oct 22 20:11:15 UTC 2024 MacubexROGLinux 6.11.5-300.fc41.x86_64

RAX 0000000000000000 RBX 00007f645c196240 RDI 0000000000000000
RCX 0000000000000003 RDX 0000000000000000 RSI 00007f69995fec00
RBP 00007f645c1958a0 RSP 00007f645c1958a0 RIP 0000000000545096
 R8 0000000000000002  R9 00007f69995fd138 R10 00007f69995fec00
R11 0000000000000070 R12 00007f645c197540 R13 00007f645c197210
R14 00007f645c196258 R15 00007f645c1958b0
TLS 00007f5f10de4b00

XMM0  00007f69995fef9000007f69995fef90 XMM8  00000000000000000000000000000000
XMM1  00000000000000000000000000000000 XMM9  00000000000000000000000000000000
XMM2  00007f644880000000007f5f0f221000 XMM10 00000000000000000000000000000000
XMM3  00007f645400000000007f6450000000 XMM11 00000000000000000000000000000000
XMM4  00007ffd816f400000007f69996e4000 XMM12 00000000000000000000000000000000
XMM5  00007f645c1a800000007f645c183000 XMM13 00000000000000000000000000000000
XMM6  203d2032585641207c2031203d20494e XMM14 00000000000000000000000000000000
XMM7  4e565f585641207c2031203d20585641 XMM15 00000000000000000000000000000000

cosmoaddr2line /home/abishek/llamafile/gemma-2-27b-it.Q6_K.llamafile 545096 43390a 42d23a 4b77a5 8cd994 8ddf94 9359e7

0x0000000000545096: ?? ??:0
0x000000000043390a: ?? ??:0
0x000000000042d23a: ?? ??:0
0x00000000004b77a5: ?? ??:0
0x00000000008cd994: ?? ??:0
0x00000000008ddf94: ?? ??:0
0x00000000009359e7: ?? ??:0

000000400000-000000a801e0 r-x-- 6656kb
000000a81000-0000031dd000 rw--- 39mb
0006fe000000-0006fe001000 rw-pa 4096b
7f5f0de00000-7f5f0e000000 rw-pa 2048kb
7f5f10800000-7f5f14800000 rw-pa 64mb
7f5f149e0000-7f6448651e80 r--s- 21gb
7f645be00000-7f645c000000 rw-pa 2048kb
7f645c184000-7f645c185000 ---pa 4096b
7f645c185000-7f645c198000 rw-pa 76kb
7f645c1f2000-7f699947389b r--s- 21gb
7f6999475000-7f6999475fc0 rw-pa 4032b
7f6999486000-7f69995ac418 rw-pa 1177kb
7f69995ad000-7f69996de000 rw-pa 1220kb
7ffd80f16000-7ffd81716000 rw--- 8192kb
# 44'974'731'264 bytes in 14 mappings


./gemma-2-27b-it.Q6_K.llamafile -m gemma-2-27b-it.Q6_K.gguf -c 8192 -ngl 1 
Segmentation fault (core dumped)
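
Note: the frames in the trace above resolve to ?? ??:0, which usually means no working addr2line was found or the binary lacks debug info. A minimal sketch for re-running symbolization, assuming binutils provides addr2line and the cosmoaddr2line helper from the Cosmopolitan toolchain is available (an unstripped or .dbg build may still be needed for real symbols):

export ADDR2LINE=$(command -v addr2line)   # the crash reporter checks PATH and the ADDR2LINE variable
cosmoaddr2line /home/abishek/llamafile/gemma-2-27b-it.Q6_K.llamafile 545096 43390a 42d23a 4b77a5 8cd994 8ddf94 9359e7
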
@OEvgeny

OEvgeny commented Nov 17, 2024

Same for me on NixOS:

> ./llamafile-0.8.16 -m gemma-2-27b-it-Q6_K_L.gguf -ngl 10
██╗     ██╗      █████╗ ███╗   ███╗ █████╗ ███████╗██╗██╗     ███████╗
██║     ██║     ██╔══██╗████╗ ████║██╔══██╗██╔════╝██║██║     ██╔════╝
██║     ██║     ███████║██╔████╔██║███████║█████╗  ██║██║     █████╗
██║     ██║     ██╔══██║██║╚██╔╝██║██╔══██║██╔══╝  ██║██║     ██╔══╝
███████╗███████╗██║  ██║██║ ╚═╝ ██║██║  ██║██║     ██║███████╗███████╗
╚══════╝╚══════╝╚═╝  ╚═╝╚═╝     ╚═╝╚═╝  ╚═╝╚═╝     ╚═╝╚══════╝╚══════╝
 launching server...
error: Uncaught SIGSEGV (SEGV_MAPERR) at 0x8 on homelab pid 3750201 tid 3750215
  ./llamafile-0.8.16
  No error information
  Linux Cosmopolitan 3.9.6 MODE=x86_64; #1-NixOS SMP PREEMPT_DYNAMIC Wed Sep 18 17:24:10 UTC 2024 homelab 6.6.52

RAX 0000000000000000 RBX 00007f9c6622e240 RDI 0000000000000000
RCX 0000000000000003 RDX 0000000000000000 RSI 00007f9d4add2c00
RBP 00007f9c6622d8a0 RSP 00007f9c6622d8a0 RIP 0000000000545066
 R8 0000000000000002  R9 00007f9d4add10d8 R10 00007f9d4add2c00
R11 0000000000000040 R12 00007f9c6622f540 R13 00007f9c6622f210
R14 00007f9c6622e258 R15 00007f9c6622d8b0
TLS 00007f9b1c3fad00

XMM0  00007f9d4add2f9000007f9d4add2f90 XMM8  00000000a33cd03000000000a33cc6b0
XMM1  00000000000000000000000000000000 XMM9  00000000000000000000000000000000
XMM2  00000000000000000000000000000000 XMM10 00000000000000000000000000000000
XMM3  00007f9cb022ba500000002c04000000 XMM11 00000000000000000000000000000000
XMM4  000000000000000000007f9d41563b20 XMM12 00000000000000000000000000000000
XMM5  7c2031203d20323135585641207c2031 XMM13 00000000000000000000000000000000
XMM6  203d2032585641207c2030203d20494e XMM14 00000000000000000000000000000000
XMM7  4e565f585641207c2031203d20585641 XMM15 00000000000000000000000000000000

cosmoaddr2line /home/enol/sd-docker/llamafile/llamafile-0.8.16 545066 43390a 42d23a 4b7775 8ce154 8de754 9369e7

note: can't find addr2line on path or in ADDR2LINE
7f9c6622a7c0 545066 llama_n_ctx+6
7f9c6622d8a0 43390a llama_server_context::load_model(gpt_params const&)+396
7f9c6622d970 42d23a server_cli(int, char**)+3318
7f9c6622ff50 4b7775 server_thread(void*)+53
7f9c6622ff60 8ce154 PosixThread+132
7f9c6622ffb0 8de754 LinuxThreadEntry+36
7f9c6622ffd0 9369e7 sys_clone_linux+39

000000400000-000000a811e0 r-x-- 6660kb
000000a82000-0000031de000 rw--- 39mb
0006fe000000-0006fe001000 rw-pa 4096b
7f95528df000-7f9a96fffe00 r--s- 21gb
7f9b04000000-7f9b04200000 rw-pa 2048kb
7f9b04400000-7f9b06600000 rw-pa 34mb
7f9b1c000000-7f9b1cc00000 rw-pa 12mb
7f9b5fe00000-7f9b60000000 rw-pa 2048kb
7f9c6621c000-7f9c6621d000 ---pa 4096b
7f9c6621d000-7f9c66230000 rw-pa 76kb
7f9c662d5000-7f9c662d5fc0 rw-pa 4032b
7f9d415c4000-7f9d416ea400 rw-pa 1177kb
7f9d416eb000-7f9d4ace0f66 r--s- 150mb
7f9d4ace1000-7f9d4ade2000 rw-pa 1028kb
7fff8d694000-7fff8de94000 rw--- 8192kb
# 22'891'671'552 bytes in 15 mappings


./llamafile-0.8.16 -m gemma-2-27b-it-Q6_K_L.gguf -ngl 10 
Segmentation fault (core dumped)

I have everything llamafile needs to compile and run on AMD set via environment variables.

Update: I just checked, and the latest working version for me is llamafile-0.8.13; everything newer results in a segfault.

@abishekmuthian
Author

@OEvgeny Do you have a low-VRAM but large-RAM setup? My issue is specifically not being able to offload to system memory when I use large models, which Ollama seems capable of.

@OEvgeny

OEvgeny commented Nov 26, 2024

I have 16GB VRAM and 64GB RAM. I had a similar issue, but I built from the latest main and everything works great from there. It is pretty easy to do.
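
For anyone else wanting to try, a minimal sketch of the source build, following the steps in the project README (the install prefix is just a common choice):

git clone https://github.com/Mozilla-Ocho/llamafile
cd llamafile
make -j8
sudo make install PREFIX=/usr/local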

@JohnnySn0w

JohnnySn0w commented Dec 27, 2024

Just piping in some more data here:

Same error on versions >= 0.8.14 when using the GPU; 0.8.13 is the last working version for GPU runs. Otherwise, I can run on CPU just fine (slooooow).

This was tested on an Arch machine.

@OEvgeny

OEvgeny commented Dec 27, 2024

Doesn't 0.8.17 work for you?

@OEvgeny

OEvgeny commented Dec 27, 2024

Also, if something changed in your system, you should remove the ~/.llamafile folder and try again.
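
A minimal sketch, assuming the compiled GPU support is cached under ~/.llamafile as in earlier releases:

rm -rf ~/.llamafile                               # force llamafile to rebuild its GPU module on next launch
./llamafile-0.8.16 -m gemma-2-27b-it-Q6_K_L.gguf -ngl 10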

@JohnnySn0w

I have not tried compiling from source. I can do that and get back to ya.

@OEvgeny

OEvgeny commented Jan 3, 2025

@JohnnySn0w I checked briefly when v0.8.17 was released, and I think it fixed the issue for me. Here is the link to the release: https://github.com/Mozilla-Ocho/llamafile/releases/tag/0.8.17

It seems there's no longer any need to build from source to fix the problem.
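
If anyone wants to test without building, a sketch of grabbing the release binary directly (the asset name is an assumption based on the naming of earlier versions in this thread; your-model.gguf is a placeholder):

curl -LO https://github.com/Mozilla-Ocho/llamafile/releases/download/0.8.17/llamafile-0.8.17
chmod +x llamafile-0.8.17
./llamafile-0.8.17 -m your-model.gguf -ngl 10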

@OEvgeny

OEvgeny commented Jan 16, 2025

I'm seeing a similar issue with any model on 0.9.0 now (after the GPU support was rebuilt from the binary):

./llamafile-0.9.0 -m ../Impish_LLAMA_3B-Q8_0.gguf -ngl 10 -c 4096 --temp 0.8 --host 127.0.0.1 --port 8080

██╗     ██╗      █████╗ ███╗   ███╗ █████╗ ███████╗██╗██╗     ███████╗
██║     ██║     ██╔══██╗████╗ ████║██╔══██╗██╔════╝██║██║     ██╔════╝
██║     ██║     ███████║██╔████╔██║███████║█████╗  ██║██║     █████╗
██║     ██║     ██╔══██║██║╚██╔╝██║██╔══██║██╔══╝  ██║██║     ██╔══╝
███████╗███████╗██║  ██║██║ ╚═╝ ██║██║  ██║██║     ██║███████╗███████╗
╚══════╝╚══════╝╚═╝  ╚═╝╚═╝     ╚═╝╚═╝  ╚═╝╚═╝     ╚═╝╚══════╝╚══════╝
 launching server...
error: Uncaught SIGSEGV (SEGV_MAPERR) at 0x618 on homelab pid 18667 tid 18678
  ./llamafile-0.9.0
  No error information
  Linux Cosmopolitan 4.0.2 MODE=x86_64; #1-NixOS SMP PREEMPT_DYNAMIC Fri Nov 22 14:38:37 UTC 2024 homelab 6.6.63

RAX 0000000000000000 RBX 00007f94c7a80838 RDI 00007f9663efc368
RCX 0000000000000618 RDX 0000000000000001 RSI 00007f9663efc368
RBP 00007f9663efcc90 RSP 00007f9663efc2c0 RIP 00007f973f395316
 R8 0000000000000007  R9 0000000000000005 R10 00007f94cb9f3e20
R11 00007f973f4e2ac0 R12 00007f94c7a807f0 R13 0000000000000000
R14 0000000000000000 R15 00007f9663efc368
TLS 00007f95180293c0

XMM0  00000000000000000000000000000000 XMM8  00000000000000000000000000000000
XMM1  00000000000000000000000000800000 XMM9  ffffffffffffffffffffffffffffffff
XMM2  00000000000000000000000000000190 XMM10 ffffffffffffffffffffffffffffffff
XMM3  726f74636576632f73656c706d617865 XMM11 00000000000000000000000000000000
XMM4  000000000000000000007f9514000090 XMM12 00000000000000000000000000000000
XMM5  7c2031203d20323135585641207c2031 XMM13 ffffffffffffffffffffffffffffffff
XMM6  203d2032585641207c2030203d20494e XMM14 00000000000000000000000000000000
XMM7  4e565f585641207c2031203d20585641 XMM15 00000000000000000000000000000000

cosmoaddr2line /home/enol/self-llamafile/bin/llamafile-0.9.0 7f973f395316 4d74f6 55efc8 447822 4411bd 4cc4e5 90ca45 91ee5e 986fa1

note: can't find addr2line on path or in ADDR2LINE
7f9663ef91c0 7f973f395316 NULL+0
7f9663efcc90 4d74f6 ggml_backend_alloc_ctx_tensors_from_buft+387
7f9663efcd20 55efc8 llama_new_context_with_model+6296
7f9663efd7f0 447822 llama_server_context::load_model(gpt_params const&)+286
7f9663efd940 4411bd server_cli(int, char**)+3339
7f9663efff50 4cc4e5 lf::chatbot::server_thread(void*)+53
7f9663efff60 90ca45 PosixThread+197
7f9663efffb0 91ee5e AmdLinuxThreadEntry+30
7f9663efffd0 986fa1 sys_clone_linux+33

000000400000-000000ae21e0 r-xi- 7048kb
000000ae3000-000003251000 rw-i- 39mb
000003251000-0006fe000000       28gb
0006fe000000-0006fe001000 rw-pa 4096b
0006fe001000-7f93da018000       128tb
7f93da018000-7f94a57ffae0 r--s- 3256mb
7f94a5800000-7f95008c5000       1457mb
7f95008c5000-7f9503cc5000 rw-pa 52mb
7f9503cc5000-7f9518000000       323mb
7f9518000000-7f9519200000 rw-pa 18mb
7f9519200000-7f9560000000       1134mb
7f9560000000-7f9560200000 rw-pa 2048kb
7f9560200000-7f9663eeb000       4157mb
7f9663eeb000-7f9663eec000 ---pa 4096b
7f9663eec000-7f9663f00000 rw-pa 80kb
7f9663f00000-7f9663fa5000       660kb
7f9663fa5000-7f9663fa5980 rw-pa 2432b
7f9663fa6000-7f973f543000       3510mb
7f973f543000-7f973f67d5f0 rw-pa 1257kb
7f973f67e000-7f974d1387e7 r--s- 219mb
7f974d139000-7f974d239000 rw-pa 1024kb
7f974d239000-7ffe85a36000       413gb
7ffe85a36000-7ffe85b36000 ---pa 1024kb
7ffe85b36000-7ffe86336000 rw-pa 8192kb
# 3'779'354'624 bytes in 15 mappings


./llamafile-0.9.0 -m ../Impish_LLAMA_3B-Q8_0.gguf -ngl 10 -c 4096 --temp 0.8 --host 127.0.0.1 --port 8080 
Segmentation fault (core dumped)
