Bug: redbean: iterating unix.opendir can lead to random crashes with SIGSEGV SIGV_ACCERR #1337

Open
andrei-markeev opened this issue Dec 14, 2024 · 3 comments
Labels
medium severity Used to report medium severity bugs (e.g. Malfunctioning Features but still useable)

Comments

@andrei-markeev

andrei-markeev commented Dec 14, 2024

Contact Details

No response

What happened?

Some time ago, I started getting random crashes with a SIGSEGV (SEGV_ACCERR) error. At first they were rare, but as I did more and more file system operations, they became very common (and very annoying 😅).

The crashes occurred seemingly at random, in different places; once I even had a crash while the server was idle.

One thing was common: the stack trace would always end with LuaUnixDirClose and closedir.

When dealing with the file system, my typical approach is to crawl a directory recursively with for name, kind in assert(unix.opendir(dir or '.')) do ... end, collecting a list of files into a table, and then do some operations on those files: for example, copy a bunch of files somewhere, or concatenate them into one file.

Running with --strace showed that the crash was happening during garbage collection. A bunch of file descriptors were closed, with the crash occurring right after the last one:

(munmap calls skipped for brevity, full log attached below)

SYS  27588  41396        358'171'690 close(17) → 0
SYS  27588  41396        359'441'548 close(16) → 0
SYS  27588  41396        360'679'756 close(15) → 0
SYS  27588  41396        361'909'983 close(14) → 0
SYS  27588  41396        363'203'811 close(11) → 0
SYS  27588  41396        364'438'224 close(9) → 0
SYS  27588  41396        365'668'176 close(8) → 0
SYS  27588  41396        366'880'838 close(13) → 0
SYS  27588  41396        367'277'431 win32 vectored exception 0xC0000005u raising SIGSEGV cosmoaddr2line /C/my/Projects/pebble-outlook-events/cloudpebble-portable.com 7ffa5413faad 5e8c0a 52d2b5 507178 5074a2 508da0 50627f 507890 508d41 50a9b5 50b095 4ffa18 502efb 441d44 507178 524b07 507360 50627f 5075bf 51ec99 42054d 430bef
SYS  27588  41396        367'776'380 gethostname(["LAPTOP-D32Q4JCL"], 64) → 0
SYS  27588  41396        368'205'609 uname([{"Windows", "LAPTOP-D32Q4JCL", "10.0", "Cosmopolitan 3.9.7 MODE=x86_64", "x86_64", ""}]) → 0

error: Uncaught SIGSEGV (SEGV_ACCERR) at 0x7ffa5413faad on LAPTOP-D32Q4JCL pid 27588 tid 41396
  cloudpebble-portable.com
  No error information (win32 error 38)
  Windows Cosmopolitan 3.9.7 MODE=x86_64 LAPTOP-D32Q4JCL 10.0

After tracking the file descriptors, I found that two of them pointed to the same folder:

SYS  36116  34700        184'280'678 openat(AT_FDCWD, "src/pkjs/js", O_RDONLY|O_CLOEXEC|O_DIRECTORY) → 13
SYS  26916  22804        127'355'271 openat(AT_FDCWD, "src/pkjs/js", O_RDONLY|O_CLOEXEC|O_DIRECTORY) → 17

Interestingly, those two openat calls belong to different requests and different API endpoints.

I think this is what caused the crashes: trying to close the same fd twice.

I changed my code to explicitly close the Dir, and the crashes stopped.

local dirInfo = assert(unix.opendir(dir or '.'))
for name, kind in dirInfo do
    -- ...
end
assert(dirInfo:close())

Version

redbean 3.0.0, also tried with a version from cosmo.zip from 12-Dec-2024

What operating system are you seeing the problem on?

Windows

Relevant log output

The full log doesn't fit here, so I attached it as a file: strace2_full.txt

error: Uncaught SIGSEGV (SEGV_ACCERR) at 0x7ffa5413faad on LAPTOP-D32Q4JCL pid 26916 tid 22804
  cloudpebble-portable.com
  No error information (win32 error 38)
  Windows Cosmopolitan 3.9.7 MODE=x86_64 LAPTOP-D32Q4JCL 10.0

RAX 000000000097d000 RBX 00003c79dbad0010 RDI 00000000002623e0
RCX 0000000000262408 RDX 0000000000000000 RSI 000000000000004f
RBP 00007000007dd5b0 RSP 00007000007dd520 RIP 00007ffa5413faad
 R8 00007000007db4d8  R9 00007000007db590 R10 0000000000000000
R11 0000000000000246 R12 0000000000000000 R13 00003c79dbad0010
R14 0000000000262408 R15 00003c79db722290
TLS 0000000000705f40

XMM0  00000000000000000000000000000000 XMM8  00000000000000000000000000000000
XMM1  00000000000000000000000000000000 XMM9  00000000000000000000000000000000
XMM2  00000000000000000000000000000000 XMM10 00000000000000000000000000000000
XMM3  00000000000000000000000000000000 XMM11 00000000000000000000000000000000
XMM4  00000000000000000000000000000000 XMM12 00000000000000000000000000000000
XMM5  00000000000000000000000000000000 XMM13 00000000000000000000000000000000
XMM6  73656c626265702f656c626265702e2f XMM14 00000000000000000000000000000000
XMM7  7665656b72616d2f73726573552f432f XMM15 00000000000000000000000000000000

cosmoaddr2line /C/my/Projects/pebble-outlook-events/cloudpebble-portable.com 7ffa5413faad 5e8c0a 52d2b5 507178 5074a2 508da0 50627f 507890 508d41 50a9b5 50b095 4ffa18 502efb 441d44 507178 524b07 507360 50627f 5075bf 51ec99 42054d 430bef

000000740b00 7ffa5413faad NULL+0
7000007dd5b0 5e8c0a closedir+74
7000007dd5d0 52d2b5 LuaUnixDirClose+53
7000007dd600 507178 luaD_precall+408
7000007dd660 5074a2 luaD_callnoyield+50
7000007dd6a0 508da0 dothecall+16
7000007dd6b0 50627f luaD_rawrunprotected+79
7000007dd730 507890 luaD_pcall+48
7000007dd790 508d41 GCTM+273
7000007dd7d0 50a9b5 singlestep+645
7000007dd810 50b095 luaC_step+149
7000007dd860 4ffa18 lua_pushlstring+72
7000007dd880 502efb luaL_pushresult+59
7000007dd8b0 441d44 LuaSlurp+452
7000007de5c0 507178 luaD_precall+408
7000007de620 524b07 luaV_execute+2215
7000007de6b0 507360 resume+176
7000007de6e0 50627f luaD_rawrunprotected+79
7000007de760 5075bf lua_resume+143
7000007de7a0 51ec99 LuaCallWithTrace+105
7000007de800 42054d LuaCallWithYield+45
7000007de870 430bef ServeLua+159
7000007de8b0 433193 RoutePath+371
7000007de8e0 433a56 Route+134
7000007de900 43486c LuaRoute+92
7000007de930 507178 luaD_precall+408
7000007de990 524b07 luaV_execute+2215
7000007dea20 507360 resume+176
7000007dea50 50627f luaD_rawrunprotected+79
7000007dead0 5075bf lua_resume+143
7000007deb10 51ec99 LuaCallWithTrace+105
7000007deb70 42054d LuaCallWithYield+45
7000007debe0 433fbb HandleRequest+1163
7000007dec40 4341ef HandleMessageActual+111
7000007dec90 434b1c HandleMessages+236
7000007decc0 435637 HandleConnection.isra.0+1015
7000007decf0 435fcd HandlePoll+141
7000007ded20 43641b EventLoop+635
7000007dee00 436c16 RedBean+1830
7000007deea0 404a74 main+68
7000007deec0 405124 cosmo+68
7000007deed0 61f638 __stack_call+16

000000320000-000000330000 rw-Pa 64kb
000000330000-000000340000 rw-Pa 64kb hand=408
000000400000-00000069e108 r-x-- 2680kb
00000069f000-000000753000 rw--- 720kb
0006fe000000-0006fe010000 rw-pa 64kb hand=276
3c79d9fb0000-3c79daa5cb51 r--s- 11mb hand=340 readonlyfile
3c79daa60000-3c79daa60f48 rw-pa 3912b hand=280
3c79daa70000-3c79daacf000 rw-pa 380kb hand=284
3c79daad0000-3c79daae0000 rw-pa 64kb hand=288
3c79daae0000-3c79daaf0000 rw-pa 64kb hand=292
3c79daaf0000-3c79dab10000 rw-pa 128kb hand=296
3c79dab10000-3c79dab20000 rw-sa 64kb hand=356
3c79dab20000-3c79dab20010 rw-pa 16b hand=300
3c79dab30000-3c79dab30028 rw-sa 40b hand=368
3c79dab40000-3c79db5ecb51 r--p- 11mb hand=304 cow
# 36'990'976 bytes in 42 mappings

  0 kFdConsole handle=80
  1 kFdConsole flags=O_WRONLY|O_APPEND handle=84
  2 kFdConsole flags=O_WRONLY|O_APPEND handle=88
  3 kFdFile handle=360
  4 kFdSocket flags=O_RDWR|O_CLOEXEC mode=0140666 handle=468
  5 kFdSocket flags=O_RDWR|O_CLOEXEC mode=0140666 handle=516
  6 kFdSocket flags=O_RDONLY|O_CLOEXEC mode=0140666 handle=444
  7 kFdFile flags=O_RDONLY|O_CLOEXEC|O_DIRECTORY handle=396
 10 kFdFile flags=O_RDONLY|O_CLOEXEC|O_DIRECTORY handle=436
 12 kFdFile flags=O_RDONLY|O_CLOEXEC|O_DIRECTORY handle=480
andrei-markeev added the medium severity label Dec 14, 2024
andrei-markeev changed the title from "Bug: redbean: iterating unix.opendir can lead to random crashes with SIGSERV SIGV_ACCERR" to "Bug: redbean: iterating unix.opendir can lead to random crashes with SIGSEGV SIGV_ACCERR" Dec 14, 2024
@jart
Owner

jart commented Dec 24, 2024

I can't reproduce this. I need you to give me example code I can run with ./redbean -i foo.lua that causes this crash. For example:

local function foo()
    local dirInfo = assert(unix.opendir(dir or '.'))
    for name, kind in dirInfo do
        print(name)
    end
end

foo()
collectgarbage("collect")

Works fine. Same with explicit closes.

I don't believe it's possible for closedir() to be called twice on the same Dir, because LuaUnixDirClose() clears the pointer after the first close:

// unix.Dir:close()
//     ├─→ true
//     └─→ nil, unix.Errno
static int LuaUnixDirClose(lua_State *L) {
  DIR **dirp;
  int rc, olderr;
  dirp = GetUnixDirSelf(L);
  if (*dirp) {
    olderr = errno;
    rc = closedir(*dirp);
    *dirp = 0;
    return SysretBool(L, "closedir", olderr, rc);
  } else {
    lua_pushboolean(L, true);
    return 1;
  }
}

@andrei-markeev
Author

I tried a couple of simple things, and indeed it is not easily reproducible. In my original code, it reproduced on almost every run. I will work towards a minimal example. It might take some time though, because of Christmas.

@jart
Owner

jart commented Dec 24, 2024

Take your time. Please also check if this impacts redbean at head. You can build it as follows:

git clone https://github.com/jart/cosmopolitan cosmo
cd cosmo
make -j8 o//tool/net/redbean
o//tool/net/redbean -i myscript.lua
