Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Application panicked in zenoh code: registry/src/index.crates.io-6f17d22bba15001f/zenoh-transport-0.10.1-rc/src/manager.rs:335:18 #701

Open
vulogov opened this issue Jan 31, 2024 · 12 comments
Labels
bug Something isn't working

Comments

@vulogov
Copy link

vulogov commented Jan 31, 2024

Describe the bug

Application is panicked in zenoh code with

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" }', /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zenoh-transport-0.10.1-rc/src/manager.rs:335:18

To reproduce

  1. Zenohd started as
zenohd --adminspace-permissions=rw --cfg='plugins/storage_manager/storages/demo:{key_expr:"zbus/**",volume:"memory"}'
  1. Publisher
fn run_history(content: String, c: cmd::Cli, stream_cmd: cmd::Stream, zc: Config) {
    let stream = Deserializer::from_str(&content).into_iter::<Value>();

    match zenoh::open(zc).res() {
        Ok(session) => {
            for value in stream {
                match value {
                    Ok(zjson) => {
                        if stream_cmd.stdout {
                            match serde_json::to_string_pretty(&zjson) {
                                Ok(val) => {
                                    println!("{}", &val);
                                }
                                Err(err) => {
                                    log::error!("Error converting JSON for stdout: {:?}", err);
                                }
                            }
                        }
                        if stream_cmd.bus {
                            match cmd::zenoh_lib::get_key_from_metadata(c.platform_name.clone(), "*".to_string(), zjson["itemid"].to_string(), &session) {
                                Some(key) => {
                                    match cmd::zbus_export_zabbix::convert_zabbix_export_payload_to_zbus(key, c.platform_name.clone(), zjson.clone()) {
                                        Some(payload) => {
                                            let store_key = match zjson["type"].as_i64() {
                                                Some(2) => format!("log/metric/{}/{}/{}{}", &c.protocol_version, &c.platform_name, &payload["src"].as_str().unwrap(), &payload["key"].as_str().unwrap()),
                                                _ => format!("zbus/metric/{}/{}/{}{}", &c.protocol_version, &c.platform_name, &payload["src"].as_str().unwrap(), &payload["key"].as_str().unwrap()),
                                            };
                                            match session.put(store_key.clone(), payload.clone()).encoding(KnownEncoding::AppJson).res() {
                                                Ok(_) => log::info!("ZBX catcher->ZBUS: {}", &store_key),
                                                Err(err) => log::error!("Error ingesting {} {:?}: {:?}", &payload["key"], &payload, err),
                                            }
                                        }
                                        None => continue,
                                    }
                                }
                                None => continue,
                            }
                        }
                    }
                    Err(err) => log::error!("zabbix streamer catch an error during JSON processing: {:?}", err),
                }
            }
            match session.close().res() {
                Ok(_) => {},
                Err(err) => log::error!("Error closing Zenoh session: {:?}", err),
            }
        }
        Err(err) => {
            log::error!("Error connecting to the bus: {:?}", err);
        }
    }
}

System info

  • Platform: Darwin home.lan 22.6.0 Darwin Kernel Version 22.6.0: Wed Jul 5 22:21:56 PDT 2023; root:xnu-8796.141.3~6/RELEASE_X86_64 x86_64
@vulogov vulogov added the bug Something isn't working label Jan 31, 2024
@Mallets
Copy link
Member

Mallets commented Jan 31, 2024

It looks like the OS returned Resource Temporarily Unavailable (error 35) while spawning new threads for the async executor. Is the issue above always happening on your machine or was it a sporadic failure?
Would you mind providing more info on your system/hardware configuration in addition to the kernel version?
Did you have many other applications running in parallel?

#566 is currently reworking the async runtime and is expected to land relatively soon.

@vulogov
Copy link
Author

vulogov commented Jan 31, 2024 via email

@Mallets
Copy link
Member

Mallets commented Feb 1, 2024

Thanks for the up-to-date information. Response below:

  1. Ok, nothing weird here.

  2. How many threads are you spawning?

  3. Nothing extremely weird going on.

  4. Moving to async/sync will not change the much the end result unless you are spawning too many threads.

Can you please monitor CPU/RAM consumption on that?
From the actual panic message really seems that you are running out of resources in the system...

@vulogov
Copy link
Author

vulogov commented Feb 2, 2024

  1. I've tried to spawn 1,2,3,4,10 http threads and result is the same. There is a single thread on sender that shall consume a single http thread.
  2. Agree
  3. Thank you for clarification

@vulogov
Copy link
Author

vulogov commented Feb 2, 2024

Here is common situation with CPU/Memory and other utilization

Screenshot 2024-02-02 at 13 45 12

@vulogov
Copy link
Author

vulogov commented Feb 2, 2024

I am running my app with extended backtrace. I will have some results in an hour or two.

@vulogov
Copy link
Author

vulogov commented Feb 3, 2024

[2024-02-03T00:18:06Z INFO  zbus::cmd::zbus_export_stream] Request is served by thread #0
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime] Using PID: 706b64c5d3a618ca4ea57962e9558259
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::1]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::aede:48ff:fe00:1122]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::1c66:20b3:ef39:9909]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::f484:9eff:fed9:8232]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::f484:9eff:fed9:8232]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::1d93:51a3:319e:1aa5]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::832e:42c4:32ca:940e]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::ce81:b1c:bd2c:69e]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::68ce:1db7:bb03:abc]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::abaa:328c:3fc6:2b79]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/192.168.86.228:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] zenohd listening scout messages on 224.0.0.224:7446
[2024-02-03T00:18:07Z INFO  zbus::cmd::zbus_export_stream] Initiate delivery to bus
[2024-02-03T00:18:07Z INFO  zbus::cmd::zbus_export_stream] ZBX catcher->ZBUS: zbus/metric/v1/local/Zabbix server/vm.memory.size/total
[2024-02-03T00:18:07Z INFO  zbus::cmd::zbus_export_stream] Closing connection to bus
[2024-02-03T00:18:07Z INFO  zbus::cmd::zbus_export_stream] Request is served by thread #0
[2024-02-03T00:18:07Z INFO  zenoh::net::runtime] Using PID: 706b64c5d3a618ca4ea57962e9558259
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" }', /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zenoh-transport-0.10.1-rc/src/manager.rs:335:18
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
   3: core::result::Result<T,E>::unwrap
             at /private/tmp/rust-20230923-7199-1y02uxq/rustc-1.72.1-src/library/core/src/result.rs:1076:23
   4: zenoh_transport::manager::TransportExecutor::new
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zenoh-transport-0.10.1-rc/src/manager.rs:332:13
   5: zenoh_transport::manager::TransportManager::new
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zenoh-transport-0.10.1-rc/src/manager.rs:383:26
   6: zenoh_transport::manager::TransportManagerBuilder::build
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zenoh-transport-0.10.1-rc/src/manager.rs:291:12
   7: zenoh::net::runtime::Runtime::init::{{closure}}
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zenoh-0.10.1-rc/src/net/runtime/mod.rs:128:33
   8: zenoh::session::Session::new::{{closure}}
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zenoh-0.10.1-rc/src/session.rs:849:41
   9: <async_std::task::builder::SupportTaskLocals<F> as core::future::future::Future>::poll::{{closure}}
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/builder.rs:199:17
  10: async_std::task::task_locals_wrapper::TaskLocalsWrapper::set_current::{{closure}}
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/task_locals_wrapper.rs:60:13
  11: std::thread::local::LocalKey<T>::try_with
             at /private/tmp/rust-20230923-7199-1y02uxq/rustc-1.72.1-src/library/std/src/thread/local.rs:270:16
  12: std::thread::local::LocalKey<T>::with
             at /private/tmp/rust-20230923-7199-1y02uxq/rustc-1.72.1-src/library/std/src/thread/local.rs:246:9
  13: async_std::task::task_locals_wrapper::TaskLocalsWrapper::set_current
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/task_locals_wrapper.rs:55:9
  14: <async_std::task::builder::SupportTaskLocals<F> as core::future::future::Future>::poll
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/builder.rs:197:13
  15: <futures_lite::future::Or<F1,F2> as core::future::future::Future>::poll
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-lite-2.2.0/src/future.rs:446:33
  16: async_executor::Executor::run::{{closure}}
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-executor-1.8.0/src/lib.rs:261:32
  17: async_executor::LocalExecutor::run::{{closure}}
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-executor-1.8.0/src/lib.rs:480:34
  18: async_io::driver::block_on::{{closure}}
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-io-2.3.1/src/driver.rs:199:37
  19: std::thread::local::LocalKey<T>::try_with
             at /private/tmp/rust-20230923-7199-1y02uxq/rustc-1.72.1-src/library/std/src/thread/local.rs:270:16
  20: std::thread::local::LocalKey<T>::with
             at /private/tmp/rust-20230923-7199-1y02uxq/rustc-1.72.1-src/library/std/src/thread/local.rs:246:9
  21: async_io::driver::block_on
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-io-2.3.1/src/driver.rs:175:5
  22: async_global_executor::reactor::block_on::{{closure}}
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-global-executor-2.4.1/src/reactor.rs:3:18
  23: async_global_executor::reactor::block_on
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-global-executor-2.4.1/src/reactor.rs:12:5
  24: async_global_executor::executor::block_on::{{closure}}
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-global-executor-2.4.1/src/executor.rs:26:36
  25: std::thread::local::LocalKey<T>::try_with
             at /private/tmp/rust-20230923-7199-1y02uxq/rustc-1.72.1-src/library/std/src/thread/local.rs:270:16
  26: std::thread::local::LocalKey<T>::with
             at /private/tmp/rust-20230923-7199-1y02uxq/rustc-1.72.1-src/library/std/src/thread/local.rs:246:9
  27: async_global_executor::executor::block_on
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-global-executor-2.4.1/src/executor.rs:26:5
  28: async_std::task::builder::Builder::blocking::{{closure}}::{{closure}}
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/builder.rs:171:25
  29: async_std::task::task_locals_wrapper::TaskLocalsWrapper::set_current::{{closure}}
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/task_locals_wrapper.rs:60:13
  30: std::thread::local::LocalKey<T>::try_with
             at /private/tmp/rust-20230923-7199-1y02uxq/rustc-1.72.1-src/library/std/src/thread/local.rs:270:16
  31: std::thread::local::LocalKey<T>::with
             at /private/tmp/rust-20230923-7199-1y02uxq/rustc-1.72.1-src/library/std/src/thread/local.rs:246:9
  32: async_std::task::task_locals_wrapper::TaskLocalsWrapper::set_current
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/task_locals_wrapper.rs:55:9
  33: async_std::task::builder::Builder::blocking::{{closure}}
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/builder.rs:168:17
  34: std::thread::local::LocalKey<T>::try_with
             at /private/tmp/rust-20230923-7199-1y02uxq/rustc-1.72.1-src/library/std/src/thread/local.rs:270:16
  35: std::thread::local::LocalKey<T>::with
             at /private/tmp/rust-20230923-7199-1y02uxq/rustc-1.72.1-src/library/std/src/thread/local.rs:246:9
  36: async_std::task::builder::Builder::blocking
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/builder.rs:161:9
  37: async_std::task::block_on::block_on
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/block_on.rs:33:5
  38: <zenoh_core::ResolveFuture<F,To> as zenoh_core::SyncResolve>::res_sync
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zenoh-core-0.10.1-rc/src/lib.rs:166:9
  39: <zenoh::OpenBuilder<TryIntoConfig> as zenoh_core::SyncResolve>::res_sync
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zenoh-0.10.1-rc/src/lib.rs:305:9
  40: zenoh_core::SyncResolve::res
             at /Users/gandalf/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zenoh-core-0.10.1-rc/src/lib.rs:57:9
  41: zbus::cmd::zbus_export_stream::run_history
             at ./src/cmd/zbus_export_stream.rs:28:11
  42: zbus::cmd::zbus_export_stream::run::{{closure}}
             at ./src/cmd/zbus_export_stream.rs:107:57
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
[2024-02-03T00:18:07Z ERROR zbus::cmd::zbus_export_stream] Zabbix catcher error in joining the thread: Any { .. }

@vulogov
Copy link
Author

vulogov commented Feb 3, 2024

Each pre-spaned thread performing this:

  1. Establish connection with Zenoh
  2. Parse received JSON's
  3. Publish succesfully parsed JSON
  4. Close connection with Zenoh

If I am to exhaust any limit, wanna know which one. And zbus_export_stream.rs:28 is at zenoh.open().res()

[2024-02-03T00:18:06Z INFO  zbus::cmd::zbus_export_stream] Request is served by thread #0
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime] Using PID: 706b64c5d3a618ca4ea57962e9558259
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::1]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::aede:48ff:fe00:1122]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::1c66:20b3:ef39:9909]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::f484:9eff:fed9:8232]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::f484:9eff:fed9:8232]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::1d93:51a3:319e:1aa5]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::832e:42c4:32ca:940e]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::ce81:b1c:bd2c:69e]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::68ce:1db7:bb03:abc]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/[fe80::abaa:328c:3fc6:2b79]:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/192.168.86.228:64017
[2024-02-03T00:18:06Z INFO  zenoh::net::runtime::orchestrator] zenohd listening scout messages on 224.0.0.224:7446
[2024-02-03T00:18:07Z INFO  zbus::cmd::zbus_export_stream] Initiate delivery to bus
[2024-02-03T00:18:07Z INFO  zbus::cmd::zbus_export_stream] ZBX catcher->ZBUS: zbus/metric/v1/local/Zabbix server/vm.memory.size/total
[2024-02-03T00:18:07Z INFO  zbus::cmd::zbus_export_stream] Closing connection to bus

@Mallets
Copy link
Member

Mallets commented Feb 5, 2024

I am running my app with extended backtrace. I will have some results in an hour or two.

Based on the screenshot it seems your system is overloaded. I can see ~10000 and only 35MB of free RAM out of 32GB.

@vulogov
Copy link
Author

vulogov commented Feb 7, 2024

Sure, I exhaust the RAM, but I do have swapins and swapouts going. Even if that, shall app crash in library code if something like this occurs ?

@Mallets
Copy link
Member

Mallets commented Feb 7, 2024

Error management could always be improved. But in this particular case Zenoh needs to spawn some threads in order to communicate on the network.

Assuming the error is somehow bubbled up, how would your application react? I'm trying to understand the potential usage pattern here...

@vulogov
Copy link
Author

vulogov commented Feb 7, 2024

One of my points, while I do exhaust the RAM, I do not exhaust VM, so as far as I concern, I do have resources. Although I will eventually get a more in-depth understanding of the mechanics of the Zenoh library, my general expectation is that if any problem happens inside the library call, the one shall return either a result or an error, which I shall handle. And if unwrap() in the library throws unhandled exception, there is not something I can do on my side of the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants