Application should abort on panic in any of the threads #619
Replies: 7 comments
-
There is another related case. WHen you start the tracker in Loading configuration from config file ./config.toml
2023-04-04T13:20:49.719450708+01:00 [torrust_tracker::bootstrap::logging][INFO] logging initialized.
2023-04-04T13:20:49.720843820+01:00 [torrust_tracker::app][WARN] Could not start UDP tracker on: 0.0.0.0:6969 while in Private. UDP is not safe for private trackers!
2023-04-04T13:30:49.722860551+01:00 [torrust_tracker::bootstrap::jobs::torrent_cleanup][INFO] Cleaning up torrents.. It is only a warning, but the UDP does not start at all. I think we should force the user to either disable the UDP tracker or change the tracker mode. What do you think @WarmBeer? |
Beta Was this translation helpful? Give feedback.
-
Maybe try this in release profile, it should abort once something panic
|
Beta Was this translation helpful? Give feedback.
-
Hey @josecelano , Sorry for the extremely late reply. I never saw this message 😅. Should we just replace the |
Beta Was this translation helpful? Give feedback.
-
Thank you for your suggestion @nyacat ! I have done a bit of testing and it does indeed seem to abort the main application if there is a panic in any of the Tokio tasks :). What do you think of using this @josecelano ? |
Beta Was this translation helpful? Give feedback.
-
Hi @WarmBeer, for that particular case, I think we only need to panic instead of warning: // Start the UDP blocks
for udp_tracker_config in &config.udp_trackers {
if !udp_tracker_config.enabled {
continue;
}
if tracker.is_private() {
warn!(
"Could not start UDP tracker on: {} while in {:?}. UDP is not safe for private trackers!",
udp_tracker_config.bind_address, config.mode
);
} else {
jobs.push(udp_tracker::start_job(udp_tracker_config, tracker.clone()));
}
} because that code is executed by the main thread. For the other cases, when we spawn a new task with Tokio I guess we should check the JoinHandle, it depends on what the spawned task returns. If it panics, we should maybe panic also the main thread, for example: #[tokio::main]
async fn main() -> io::Result<()> {
let join_handle: task::JoinHandle<Result<i32, io::Error>> = tokio::spawn(async {
panic!("boom");
});
let err = join_handle.await.unwrap_err();
assert!(err.is_panic());
Ok(())
} And @nyacat, regarding the https://users.rust-lang.org/t/what-happens-when-an-async-task-panics/68668 but I would prefer to catch it and handle it in the main thread. Right now, I think we prefer the "all or nothing" approach, which means we would stop all the services if we cannot run one of them. But in the future, that could depend on configuration settings, or we could restart some services from an admin panel. What I would do is review all the "Unrecoverable Errors" (with panic or error) and handle them on the main tread. The current implementation could just panic again. For example:
I think that's basically the original @WarmBeer's proposal |
Beta Was this translation helpful? Give feedback.
-
notes from discussion: |
Beta Was this translation helpful? Give feedback.
-
Relates to: #618 Hi @WarmBeer maybe in some cases we should not abort. For example, if the app is processing a request and that request is the one producing the panic. Maybe, we should specify concrete examples and open a new issue for each of them. |
Beta Was this translation helpful? Give feedback.
-
Currently, when there is a panic in any of our spawned tokio tasks, the application will continue to run. Only the specific task from where the panic originated will be aborted.
I believe that we should change this behaviour to abort the main runtime on any panic. Whether it is from the main runtime or from a spawned task.
The problem with the current behaviour is mainly that the app will continue to run even if one or multiple servers (HttpTracker, Api, UdpTracker) yield a panic when trying to start them.
Relevant discussion
Beta Was this translation helpful? Give feedback.
All reactions