Fix issues that caused gamescope to abort at exit #1335
Just as a heads up, @sharkautarch: this PR causes both a gamescope and an xwayland coredump once the "Switch to Desktop" shortcut is used within gamescope-session on my 7900XTX on Fedora 40. This results in a 20-40 second freeze before the session switch executes properly and returns me to desktop.
gamescope-coredump.log
Edit: also, the same issue happens when the PR is applied against upstream gamescope master, so you can disregard the gamescope-plus logging in gamescope-stdout.
@matte-schwartz Then, try running gamescope with these environment variables to run w/ address sanitizer:
Note that when running w/ address sanitizer, coredumping is disabled, because otherwise the coredumps would be terabytes in size…
I’m not sure if you might need to actually recompile gamescope w/ address sanitizer instrumentation in order to catch whatever issue is going on…
I gave this a try but keep getting an error about needing to link asan at runtime, even though I built gamescope following your instructions below. Let me read up a bit on how these tools are supposed to work so I can try and figure out what's going wrong.
@matte-schwartz If so, try running ASAN again, but this time with the
@matte-schwartz I revised this PR, so hopefully it no longer causes xwayland to coredump when trying to switch to desktop in gamescope-session, tho I haven't tested that specific case, so you'll have to test that yourself. I also ended up having to address another issue I found in the process of revising this PR, where it appeared that gamescope may have sometimes been hanging onto a lock (
also release the xwayland_server_guard lock before calling pthread_exit() to prevent gamescope from hanging at exit
The first commit fixes issue #1305
Note: I used a user event instead of the SDL quit event, to ensure that the SDL thread is only closed from inside `steamcompmgr_exit()`.
After fixing the aforementioned issue, I noticed that gamescope would still sometimes abort/coredump at exit.
So I found a separate issue, wherein the present_wait thread, when running `GetVBlankTimer().MarkVBlank( vblanktime, true );`, would segfault upon trying to access a backend while `steamcompmgr_exit()` was deleting said backend.
The second commit fixes this w/ an awkward dance between the two threads, where `steamcompmgr_exit()` asks the present_wait thread to exit, and then waits for the present_wait thread to respond back.
The reason why I did that weird stuff in the second commit is that I wanted to minimize the overhead being added to `present_wait_thread_func()`: for most of the runtime, this commit will not add any delay to `present_wait_thread_func()`.
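The request-then-acknowledge handshake described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the commit: `PresentWaitState`, `kWakeForExit`, `present_wait_worker()`, `request_exit_and_wait()` and `run_handshake_demo()` are hypothetical names, and the busy-wait loops stand in for whatever blocking wait gamescope really uses.

```cpp
#include <atomic>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical stand-ins for the globals named in the PR text; not gamescope's definitions.
struct PresentWaitState {
    std::atomic<bool>     shouldExit{false}; // set by the exiting thread
    std::atomic<uint64_t> waitId{0};         // 0 means "nothing to wait on"
    std::atomic<bool>     exited{false};     // acknowledgement from the worker
    std::atomic<int>      framesHandled{0};  // stand-in for work that touches the backend
};

constexpr uint64_t kWakeForExit = UINT64_MAX; // hypothetical sentinel id used only to wake the worker

void present_wait_worker(PresentWaitState& s) {
    for (;;) {
        uint64_t id;
        while ((id = s.waitId.load()) == 0)
            std::this_thread::yield();        // idle until an id is published

        // shouldExit is stored before waitId, so a worker woken by the
        // exit request is guaranteed to observe the flag here.
        if (s.shouldExit.load())
            break;

        s.framesHandled.fetch_add(1);         // the real thread would touch the backend here
        s.waitId.compare_exchange_strong(id, 0); // clear the id only if nobody replaced it
    }
    s.exited.store(true);                     // tell the exiting thread it may proceed
}

void request_exit_and_wait(PresentWaitState& s) {
    s.shouldExit.store(true);                 // 1) publish the exit request
    s.waitId.store(kWakeForExit);             // 2) then wake the worker
    while (!s.exited.load())
        std::this_thread::yield();            // 3) wait for the acknowledgement
}

// Runs one ordinary wait id through the worker, then shuts it down.
int run_handshake_demo() {
    PresentWaitState s;
    std::thread worker(present_wait_worker, std::ref(s));
    s.waitId.store(1);
    while (s.framesHandled.load() == 0)
        std::this_thread::yield();
    request_exit_and_wait(s);                 // only after this returns would the backend be deleted
    worker.join();
    return s.framesHandled.load();
}
```

In this sketch the worker's normal path only gains one extra atomic load per wake-up, which is the kind of low overhead the description is aiming for.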
Update: slight edit to the second commit: in `steamcompmgr_exit()`, I moved the atomic store to `g_presentThreadShouldExit` so that it happens before the atomic store to `g_currentPresentWaitId`, just to cover any rare edge case w/ atomic load/store timing.
EDIT:
I just revised this PR w.r.t. the second issue: instead of trying to insert checks inside the present wait thread, I changed how backend creation and deletion work, so that it is now safe to call `SetBackend()` while other threads are accessing the backend via `GetBackend()`.
The added synchronization overhead should be pretty low, since the only change affecting `GetBackend()` and `IBackend::Get()` is that `s_pBackend` is now an atomic ptr, and the plain atomic load (even w/ the default seq_cst ordering) from `s_pBackend` gets compiled down to a plain move instruction on x86_64; on aarch64, it's only a load-acquire atomic instruction (LDAR, which still allows non-atomic accesses to be reordered around it).
I also fixed an additional issue that I noticed, where, at least when using the SDL backend, gamescope would hang at exit (tho this can only be seen after fixing the two previous issues, which would otherwise coredump gamescope at exit before it could hang).
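Returning to the `s_pBackend` change: a minimal sketch of an atomic backend pointer with lock-free reads might look like the code below. The PR text does not say how old backends are actually destroyed, so the "retire now, delete once no readers remain" scheme here is an assumption, and `DummyBackend`, `s_retired`, `DestroyRetiredBackends()` and `run_backend_swap_demo()` are hypothetical names. The bodies of `GetBackend()` and `SetBackend()` are likewise illustrative, not gamescope's.

```cpp
#include <atomic>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical minimal interface; gamescope's IBackend is far larger.
struct IBackend {
    virtual ~IBackend() = default;
    virtual int Id() const = 0;
};

struct DummyBackend final : IBackend {
    explicit DummyBackend(int id) : m_id(id) {}
    int Id() const override { return m_id; }
    int m_id;
};

static std::atomic<IBackend*> s_pBackend{nullptr};

// Readers pay for a single atomic load and take no lock.
inline IBackend* GetBackend() { return s_pBackend.load(); }

// Assumption: replaced backends are parked here instead of being deleted
// while another thread might still hold the old pointer.
static std::mutex s_retiredMutex;
static std::vector<std::unique_ptr<IBackend>> s_retired;

void SetBackend(IBackend* pNew) {
    IBackend* pOld = s_pBackend.exchange(pNew);
    if (pOld) {
        std::lock_guard<std::mutex> lock(s_retiredMutex);
        s_retired.emplace_back(pOld);
    }
}

// Call only after every thread that uses GetBackend() has been joined.
void DestroyRetiredBackends() {
    std::lock_guard<std::mutex> lock(s_retiredMutex);
    s_retired.clear();
}

// Swaps the backend while a reader thread is polling it.
int run_backend_swap_demo() {
    SetBackend(new DummyBackend(1));
    std::atomic<bool> stop{false};
    std::atomic<int> badReads{0};
    std::thread reader([&] {
        while (!stop.load()) {
            IBackend* p = GetBackend();
            if (p && p->Id() != 1 && p->Id() != 2)
                badReads.fetch_add(1);
        }
    });
    SetBackend(new DummyBackend(2));   // swap while the reader is running
    stop.store(true);
    reader.join();
    int current = GetBackend()->Id();
    SetBackend(nullptr);               // retire the last backend as well
    DestroyRetiredBackends();          // safe here: no readers remain
    return badReads.load() == 0 ? current : -1;
}
```

The point of the sketch is only that the read side stays a plain atomic load; all of the extra bookkeeping sits on the rarely used `SetBackend()` path.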
For whatever reason, it seems like gamescope could hang at exit if the `xwayland_server_guard` lock was still locked when calling `pthread_exit()`, so this PR addresses that issue by simply unlocking that lock before calling `pthread_exit()`.
Also, a disclaimer: when testing this PR w/ UBSAN, I noticed some warnings from UBSAN about vptr issues w.r.t. `CVBlankTimer`, which only appeared while gamescope was exiting and which don't seem to appear in upstream gamescope.
Not sure if that's just a false positive; I couldn't find anything wrong besides that.
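For reference, the unlock-before-`pthread_exit()` fix mentioned above follows a pattern like the one below. This is a sketch under assumptions: the PR text does not say what kind of lock `xwayland_server_guard` is, so a raw `pthread_mutex_t` named `g_guard` stands in for it, and `exiting_thread()` and `guard_is_free_after_thread_exit()` are hypothetical helpers.

```cpp
#include <pthread.h>

// Hypothetical stand-in for the lock the PR text calls xwayland_server_guard.
static pthread_mutex_t g_guard = PTHREAD_MUTEX_INITIALIZER;

static void* exiting_thread(void*) {
    pthread_mutex_lock(&g_guard);
    // ... work that needs the guard would go here ...

    // Release explicitly: pthread_exit() does not unlock a raw mutex,
    // so any other thread waiting on it would otherwise block forever.
    pthread_mutex_unlock(&g_guard);
    pthread_exit(nullptr);
}

// Returns true if the guard can still be taken after the thread has exited.
bool guard_is_free_after_thread_exit() {
    pthread_t t;
    if (pthread_create(&t, nullptr, exiting_thread, nullptr) != 0)
        return false;
    pthread_join(t, nullptr);
    if (pthread_mutex_trylock(&g_guard) != 0)
        return false;                  // still held: the hang scenario
    pthread_mutex_unlock(&g_guard);
    return true;
}
```

If the explicit unlock is removed from `exiting_thread()`, the `pthread_mutex_trylock()` call fails, which mirrors the hang at exit in a form that can be checked without blocking.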