-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: fix scout deadlock #188
Conversation
Releasing GIL in Scout destructor prevents the deadlock.
This fix brings the question of the hidden side effects of |
Hi @wyfo , this is a nice fix by yielding the GIL. I try to visualize the fix. Please correct me if I'm wrong. sequenceDiagram
participant C as GIL
participant A as Python
participant B as Rust
participant D as Rust
box zenoh-python
participant A
participant B
participant C
end
box zenoh-rust
participant D
end
A ->> A: scout.stop
A ->> C: Acquire lock
C ->> A: Return lock
activate C
A ->> C: Release lock <br> (This fixes the issue)
deactivate C
A ->> B: _Scout::drop
B ->> D: ScoutInner::drop
D ->> B: PyClosure::drop
B ->> C: Acquire lock
C ->> B: Return lock
activate C
B ->> A: Closure.drop
A ->> B: Closure.drop <br> return
B ->> C: Release lock
deactivate C
B ->> D: PyClosure::drop <br> return
D ->> B: ScoutInner::drop <br> return
B ->> A: _Scout::drop <br> return
And my another fix in zenoh rust looks like this. sequenceDiagram
participant C as GIL
participant A as Python
participant B as Rust
participant D as Rust
box zenoh-python
participant A
participant B
participant C
end
box zenoh-rust
participant D
end
A ->> A: scout.stop
A ->> C: Acquire lock
C ->> A: Return lock
activate C
A ->> B: _Scout::drop
B ->> D: ScoutInner::drop
D ->> B: ScoutInner::drop <br> return
B ->> A: _Scout::drop <br> return
A ->> C: Release lock
deactivate C
D ->> D: Swtich to another thread
D ->> B: PyClosure::drop
B ->> C: Acquire lock
C ->> B: Return lock
activate C
B ->> A: Closure.drop
A ->> B: Closure.drop <br> return
B ->> C: Release lock
deactivate C
B ->> D: PyClosure::drop <br> return
|
You graphs need one correction: the GIL is not acquired by With your solution, I suspect the Python destructor could in fact not even be executed, because the main thread could terminate before the execution of the spawned thread.
The issue is that we don't control the cleanup, because it's done on zenoh size, with the Python object embedded, and we don't control zenoh objects. I mean, it could be possible by storing the Python callback into some `Arc<Mutex<Option>>, and manually extract the Python object to prevent zenoh to call the Python drop callback, but it's quite a lot of complexity when releasing the GIL is enough. |
Yes. That's what I said, "non-deterministic".
I see. Then we might need to perform all the similar drop with this trick in zenoh-python. |
You want me to update the drop of the other classes? |
Let's launch a discussion with the others first in the channel. I think others may want to learn more details about this problem. To me, this PR has already been good to merge. |
* chore: bump pyo3 to 0.21 (#175) Roughly adapt the code to the new PyO3 API, with a few warning fixes. * fix(examples): fix commented parts of z_put.py (#176) * fix(examples): fix commented parts of z_put.py * fix: apply PR reviews * fix: apply PR reviews * Correct the syntax of pyproject.toml (#181) * feat: Automate Release (#165) * build: Get Read the Docs release number from Cargo manifest This avoids hardcoding the release number in the documentation build config, making it easier to bump the version by only modifying the manifest. * build: Require Python 3.11 in Read the Docs configuration Python 3.11 is needed to access tomllib; useful for parsing the Cargo manifest file. * feat: Create branch, bump version and tag in release workflow * feat: Add publish-github job * fix: Broken tag dependencies * chore: Remove enforce-linking-issues workflow * fix: Bump version in pyproject.toml * chore: Upgrade artifact actions from v3 to v4 * fix: Typo in git-commit command * fix: Support jq 1.6 ubuntu-22.04 runners use jq 1.6 which doesn't recognize a dot for `[]` value iterator. See: jqlang/jq#1168. * Revert "chore: Upgrade artifact actions from v3 to v4" This reverts commit a535971. * fix: Build wheels from release branch * fix: Switch to pypa/gh-action-pypi-publish@release/v1 The older actions doesn't recognize the pyproject.toml metadata fields. * Force sphinx version to 7.2.6 (#185) * fix: Install maturin according to requirements-dev.txt (#186) * fix: fix scout deadlock (#188) Releasing GIL in Scout destructor prevents the deadlock. * fix(uhlc): bump uhlc version (#173) Signed-off-by: gabrik <[email protected]> Co-authored-by: Luca Cominardi <[email protected]> * feat(tracing): using zenoh-util function for log initialization (#172) * feat(tracing): using zenoh-util function for log initialization Signed-off-by: gabrik <[email protected]> * chore: adding Cargo.lock Signed-off-by: gabrik <[email protected]> * feat(tracing): using zenoh main Signed-off-by: gabrik <[email protected]> * chore: using new try_init_log_from_env Signed-off-by: gabrik <[email protected]> * chore: sync Cargo.lock Signed-off-by: gabrik <[email protected]> --------- Signed-off-by: gabrik <[email protected]> Co-authored-by: Luca Cominardi <[email protected]> * feat: add Attachment to API (#189) * feat: add Attachment to API * fix: make attachment a CLI arg in z_put example * Update zenoh/session.py Co-authored-by: Luca Cominardi <[email protected]> * Apply suggestions from code review Co-authored-by: Luca Cominardi <[email protected]> * test: add test for attachment --------- Co-authored-by: Luca Cominardi <[email protected]> * build: Sync with eclipse-zenoh/zenoh@81217c7 from 2024-04-22 (#190) Co-authored-by: eclipse-zenoh-bot <[email protected]> * fix: Create virtual environment for armv6 build (#191) * build: Sync with eclipse-zenoh/zenoh@2fdddae from 2024-04-23 (#192) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@9a9832a from 2024-04-24 (#193) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@7c64d99 from 2024-04-26 (#194) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@ad58af6 from 2024-04-26 (#195) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@e8916bf from 2024-04-26 (#196) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@ea604b6 from 2024-04-29 (#197) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@371ca6b from 2024-04-30 (#198) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@7a47445 from 2024-05-03 (#200) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@f5195c0 from 2024-05-03 (#202) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@e53364f from 2024-05-04 (#203) Co-authored-by: eclipse-zenoh-bot <[email protected]> * test: add z_ping/z_pong examples (#205) * fix: make attachment API more similar to dict API (#201) * fix: make attachment API more similar to dict API * fix: fix formatting * fix: fix formatting * fix: fix typing * fix: uncomment tests 😅 * docs: add docstring for Attachment * build: Sync with eclipse-zenoh/zenoh@7e5d5e8 from 2024-05-07 (#206) Co-authored-by: eclipse-zenoh-bot <[email protected]> * fix: add `Encoding.with_suffix` instead of `Encoding.append` (#208) * fix: add `Encoding.with_suffix` instead of `Encoding.append` * feat: add `prefix`/`suffix` methods to `Encoding` * build: Sync with eclipse-zenoh/zenoh@b8dd01d from 2024-05-07 (#209) Co-authored-by: eclipse-zenoh-bot <[email protected]> * Support timeout for get in session. Signed-off-by: ChenYing Kuo <[email protected]> * Update PR based on the review. Signed-off-by: ChenYing Kuo <[email protected]> * Fix CI lint error. Signed-off-by: ChenYing Kuo <[email protected]> * Update: PyTypeError => PyValueError Signed-off-by: ChenYing Kuo <[email protected]> * build: Sync with eclipse-zenoh/zenoh@45e05f0 from 2024-05-13 (#212) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@763a05f from 2024-05-14 (#213) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@75aa273 from 2024-05-15 (#214) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@25f06bd from 2024-05-21 (#215) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@3118d31 from 2024-05-28 (#216) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@d574654 from 2024-06-03 (#217) Co-authored-by: eclipse-zenoh-bot <[email protected]> * chore: Update artifacts action to v4 (#218) artifacts actions v3 are deprecated * fix: give unique name for artifacts (#220) upload-artifacts/v4 has a breaking change from v3, which requires the artifacts names to be unique. Fix #219 * build: Sync with eclipse-zenoh/zenoh@c279982 from 2024-06-05 (#221) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@d8e66de from 2024-06-10 (#222) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@9d09742 from 2024-06-11 (#223) Co-authored-by: eclipse-zenoh-bot <[email protected]> * Fix markdown format in README. (#211) Signed-off-by: ChenYing Kuo <[email protected]> * build: Sync with eclipse-zenoh/zenoh@ed6c636 from 2024-06-12 (#225) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@8160b01 from 2024-06-13 (#228) Co-authored-by: eclipse-zenoh-bot <[email protected]> * Enable releasing from any branch (#227) * build: Sync with eclipse-zenoh/zenoh@7adad94 from 2024-06-14 (#230) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@93f93d2 from 2024-06-17 (#231) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@2500e5a from 2024-06-20 (#236) Co-authored-by: eclipse-zenoh-bot <[email protected]> * fix: remove deprecated maturin arg (#241) See PyO3/maturin#1620, `--universal2` has been deprecated * build: Sync with eclipse-zenoh/zenoh@869ace6 from 2024-07-02 (#247) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@b93ca84 from 2024-07-03 (#248) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@b3e42ce from 2024-07-08 (#252) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@b3e42ce from 2024-07-08 (#253) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@b3e42ce from 2024-07-08 (#259) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@0a969cb from 2024-07-25 (#261) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@0a969cb from 2024-07-25 (#262) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@e587aa9 from 2024-07-26 (#265) Co-authored-by: eclipse-zenoh-bot <[email protected]> * build: Sync with eclipse-zenoh/zenoh@2d88c7b from 2024-07-29 (#268) Co-authored-by: eclipse-zenoh-bot <[email protected]> * chore: remove useless dependencies --------- Signed-off-by: gabrik <[email protected]> Signed-off-by: ChenYing Kuo <[email protected]> Co-authored-by: Yuyuan Yuan <[email protected]> Co-authored-by: Mahmoud Mazouz <[email protected]> Co-authored-by: oteffahi <[email protected]> Co-authored-by: Gabriele Baldoni <[email protected]> Co-authored-by: Luca Cominardi <[email protected]> Co-authored-by: eclipse-zenoh-bot <[email protected]> Co-authored-by: eclipse-zenoh-bot <[email protected]> Co-authored-by: ChenYing Kuo <[email protected]> Co-authored-by: kydos <[email protected]> Co-authored-by: Diogo Matsubara <[email protected]> Co-authored-by: Diogo Matsubara <[email protected]>
Closes #180
Releasing GIL in Scout destructor prevents the deadlock.