fsspec.asyn.sync should check the liveness of IO thread to avoid deadlocks #1723
Comments
Two additional remarks regarding this change.

This mitigation is not perfect

This check is only applied when fsspec event loop is used with https://github.com/fsspec/adlfs/blob/2024.7.0/adlfs/spec.py#L2021

When we check the readiness of an event loop, we use

As fsspec event loop has a certain unhealthy period potentially leading to deadlocks, it would be better to encourage developers to use the fsspec event loop only in conjunction with

Errors produced by this check may be confusing

This check raises the exception only in a specific case (

However, this error can confuse users as it depends on the garbage collection of unclosed files, the timing of which is difficult to predict and control. It might be better to add an explanation about this error in the documentation.
I might suggest wrapping the exceptions with

We might want to avoid any sync calls during shutdown, but some connection pools really like to be shut down before exit, and produce various warnings of their own if not allowed to do so.

Another reasonable check that might be done in sync() is that the process ID is still what we started with (which will not be true following
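The process ID check suggested in this comment could look roughly like the sketch below. `PidGuard` and its injectable `getpid` parameter are hypothetical names introduced only for illustration; they are not part of fsspec, and the real check would presumably live inside `sync()` itself.

```python
import os


class PidGuard:
    """Remember the creating process ID and detect when it changes.

    Hypothetical helper for illustration; not part of fsspec.
    """

    def __init__(self, getpid=os.getpid):
        # getpid is injectable only so the check can be exercised without
        # actually creating a new process.
        self._getpid = getpid
        self._owner_pid = getpid()

    def check(self):
        current = self._getpid()
        if current != self._owner_pid:
            raise RuntimeError(
                f"event loop was created in process {self._owner_pid}, "
                f"but is being used from process {current}"
            )


guard = PidGuard()
guard.check()  # same process: passes silently

# Simulate a changed process ID with a fake getpid that returns a
# different value on its second call.
pids = iter([100, 200])
stale_guard = PidGuard(getpid=lambda: next(pids))
try:
    stale_guard.check()
    pid_error = None
except RuntimeError as exc:
    pid_error = str(exc)
```

Raising early here turns a wait on a loop owned by another process into an immediate, explainable error.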
fsspec.asyn creates and runs an event loop used by async file system implementations as the default event loop. However, this module does not explicitly close the event loop. As a result, when a Python interpreter enters the shutdown sequence, there is a period during which the event loop is still marked as "running", but the IO thread running the loop has already stopped.
This period is dangerous as it can lead to unexpected deadlocks, particularly if there are unclosed files that potentially trigger file system access when they are garbage-collected at the final moment of the interpreter shutdown, as I reported in #1685.
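The hang can be approximated with the standard library alone. The sketch below is a simulation under stated assumptions, not a reproduction of the interpreter shutdown state: it stops the loop and joins its thread explicitly, then submits work to the loop the way a synchronous wrapper would. The `read_bytes` coroutine is a made-up stand-in for file system access.

```python
import asyncio
import concurrent.futures
import threading

# A loop driven by a daemon thread, standing in for the fsspec IO thread.
loop = asyncio.new_event_loop()
thread = threading.Thread(target=loop.run_forever, daemon=True)
thread.start()

# Simulate the IO thread going away while callers still hold the loop.
loop.call_soon_threadsafe(loop.stop)
thread.join()


async def read_bytes():
    # Hypothetical stand-in for an async file system operation.
    return b"data"


coro = read_bytes()
future = asyncio.run_coroutine_threadsafe(coro, loop)
try:
    # Nobody is running the loop, so the future never completes.
    # Without the timeout this call would block forever.
    future.result(timeout=0.2)
    timed_out = False
except concurrent.futures.TimeoutError:
    timed_out = True

# Clean up so the never-started coroutine does not trigger a warning.
coro.close()
loop.close()
```

The submission itself succeeds because the loop is not closed; only the wait for the result reveals that no thread is left to do the work.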
To mitigate this risk, we should add a liveness check of the IO thread in fsspec.asyn.sync(), which is widely used by async file system implementations to run async functions synchronously.
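A minimal sketch of such a liveness check, written against the standard library rather than fsspec itself, might look as follows. The names `start_io_loop` and `sync_with_liveness_check` are hypothetical, and the dead IO thread is simulated by stopping the loop and joining the thread, which is an assumption standing in for the interpreter shutdown state.

```python
import asyncio
import threading


def start_io_loop():
    """Start an event loop in a daemon thread (stand-in for the IO thread)."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, name="io-sketch", daemon=True)
    thread.start()
    return loop, thread


def sync_with_liveness_check(loop, thread, coro_func, *args, timeout=None):
    """Run coro_func(*args) on `loop` and block for the result.

    Hypothetical helper: refuses to submit work when the thread that should
    be driving the loop is no longer alive, instead of blocking forever.
    """
    if loop.is_closed():
        raise RuntimeError("Loop is closed")
    if not thread.is_alive():
        raise RuntimeError("IO thread is not running; refusing to wait on a dead loop")
    future = asyncio.run_coroutine_threadsafe(coro_func(*args), loop)
    return future.result(timeout)


async def add(a, b):
    await asyncio.sleep(0)
    return a + b


loop, thread = start_io_loop()
healthy_result = sync_with_liveness_check(loop, thread, add, 1, 2)

# Stop the loop and wait for the thread to exit to simulate a dead IO thread.
loop.call_soon_threadsafe(loop.stop)
thread.join()

try:
    sync_with_liveness_check(loop, thread, add, 1, 2)
    dead_thread_error = None
except RuntimeError as exc:
    dead_thread_error = str(exc)

loop.close()
```

The check runs before the coroutine object is created, so a rejected call leaves no un-awaited coroutine behind.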