Document issue with shutdown #283

Merged
merged 4 commits into from
Oct 1, 2024
23 changes: 23 additions & 0 deletions README.md
@@ -131,3 +131,26 @@ For instance:
- `RUST_LOG=zenoh=debug` activates all the debug logs.

For more information on the `RUST_LOG` syntax, see https://docs.rs/env_logger/latest/env_logger/#enabling-logging.

### Known Issues

#### Crash when program terminates

When a program terminates, global and static objects are destroyed in the reverse order of their
construction. The `Thread Local Storage` is one such entity, and the `tokio` runtime in Zenoh
uses it. If the Zenoh session is closed after this entity is cleared, it causes a panic like the one shown below.

```
thread '<unnamed>' panicked at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/std/src/thread/local.rs:262:26:
cannot access a Thread Local Storage value during or after destruction: AccessError
```

This can happen with `rmw_zenoh` if a ROS 2 node's `Context` is not shut down explicitly before the
program terminates. In this scenario, the `Context` is shut down inside its own destructor,
which then closes the Zenoh session. Since the order in which global/static objects are destroyed
is not clear, this often leads to the above panic.
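
The reverse-order rule described above can be illustrated with a small, standard-library-only sketch. `Tracker` and `destruction_order` are hypothetical names introduced for illustration, not part of `rmw_zenoh`, Zenoh, or `rclcpp`. The sketch uses objects in a local scope to keep the order observable; the text above describes the same rule for global and static objects.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: appends its name to a shared log when destroyed.
class Tracker
{
public:
  Tracker(std::string name, std::vector<std::string> & log)
  : name_(std::move(name)), log_(log) {}

  ~Tracker() {log_.push_back(name_);}

private:
  std::string name_;
  std::vector<std::string> & log_;
};

// Constructs two objects in a fixed order and returns the order
// in which their destructors ran.
std::vector<std::string> destruction_order()
{
  std::vector<std::string> log;
  {
    Tracker runtime("runtime", log);  // constructed first, destroyed last
    Tracker session("session", log);  // constructed second, destroyed first
  }
  return log;
}
```

Here `session` is destroyed before `runtime` only because it was constructed after it. When the construction order of the objects involved is not under the program's control, neither is the order of their destruction.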

The recommendation is to ensure the `Context` is shut down before a program terminates.
For example, when using `rclcpp`, ensure `rclcpp::shutdown()` is invoked before the program exits.
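
A minimal sketch of the difference between the two shutdown paths, assuming a mock context rather than the real `rclcpp` types. `MockContext` and `run` are hypothetical names used only for illustration, and the mock "session" is just a log entry.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a ROS 2 context; this is not the rclcpp API.
class MockContext
{
public:
  explicit MockContext(std::vector<std::string> & events)
  : events_(events) {}

  // Explicit path: the mock session is closed at a point the program chooses.
  void shutdown()
  {
    if (!shut_down_) {
      events_.push_back("session closed by explicit shutdown");
      shut_down_ = true;
    }
  }

  // Fallback path: the mock session is closed during destruction,
  // which is the case the text above warns about.
  ~MockContext()
  {
    if (!shut_down_) {
      events_.push_back("session closed in destructor");
    }
  }

private:
  std::vector<std::string> & events_;
  bool shut_down_ = false;
};

// Runs a mock program and reports which path closed the session.
std::vector<std::string> run(bool explicit_shutdown)
{
  std::vector<std::string> events;
  {
    MockContext context(events);
    events.push_back("work done");
    if (explicit_shutdown) {
      context.shutdown();  // with rclcpp, this is where rclcpp::shutdown() would go
    }
  }
  return events;
}
```

In an `rclcpp` executable, the equivalent pattern is to call `rclcpp::init(argc, argv)` at the start of `main` and `rclcpp::shutdown()` before `main` returns, so that the session is never closed from a destructor that runs during program teardown.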
Collaborator

I think we need to be careful with this advice.

In particular, if you are writing composable nodes, you basically never want to call rclcpp::shutdown. That's because rclcpp::shutdown is a global operation, and will shut down your node and all other nodes that happen to be composed with your node at the same time. That's probably not what is intended.

The composable container should call rclcpp::shutdown when it is going down, but that is a different thing.

Overall, I'm not saying that you are wrong, but there is a subtle distinction here.
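
To make the "global operation" point concrete, here is a rough sketch that assumes a single process-wide flag standing in for the shared context. Everything in the `sketch` namespace is hypothetical and only mimics the behavior described above; it is not how `rclcpp` is implemented.

```cpp
#include <atomic>
#include <string>
#include <vector>

namespace sketch
{
// Hypothetical process-wide state shared by every node in the process.
std::atomic<bool> g_ok{true};

// Hypothetical global shutdown: there is no per-node variant.
void global_shutdown() {g_ok = false;}

// Hypothetical node that only runs while the process-wide state is valid.
struct MockNode
{
  std::string name;
  bool is_running() const {return g_ok.load();}
};

// Returns the names of the nodes that are still running.
std::vector<std::string> running_nodes(const std::vector<MockNode> & nodes)
{
  std::vector<std::string> names;
  for (const auto & node : nodes) {
    if (node.is_running()) {
      names.push_back(node.name);
    }
  }
  return names;
}
}  // namespace sketch
```

If one composed node calls `sketch::global_shutdown()` when it has finished its own work, every other node in the same container stops as well, which is why the call belongs to the container rather than to an individual node.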

Member Author

Wouldn't composable nodes be created as shared libraries without a main function, and hence lack the opportunity to invoke rclcpp::init() and rclcpp::shutdown()? Only executables that implement compile-time composition would include these statements, so shouldn't the advice above be valid in that scenario?

The default main implementation for composable nodes includes an rclcpp::shutdown() call: https://github.com/ros2/rclcpp/blob/a78d0cbd33b8fe0b4db25c04f7e10017bfca6061/rclcpp_components/src/node_main.cpp.in#L78

Member Author

Happy to massage the wording in any other way!

Collaborator

> Wouldn't composable nodes be created as shared libraries without a main function and hence lack the opportunity to invoke rclcpp::init() and rclcpp::shutdown()?

I mean, they shouldn't call it, but that doesn't mean they won't. rclcpp::shutdown, in particular, is often (erroneously) called when a node is done with its work; even our demos are guilty of this: https://github.com/ros2/demos/blob/0f2afe53be38b71c01d43f0900e18187d2e36082/demo_nodes_cpp/src/services/add_two_ints_client_async.cpp#L65

I'm going to suggest some wording to point out that composable nodes shouldn't generally do this.


For more details, see https://github.com/ros2/rmw_zenoh/issues/170.