[Bug] Zephyr pub/sub example not communicating #156
Comments
TL;DR: your device is running out of memory because the value you have configured for CONFIG_HEAP_MEM_POOL_SIZE is too small. The current version of the Zenoh protocol needs to be extended with additional capability negotiation during session establishment, so that each side can adapt the communication to the other's capabilities. Several improvements in this respect will come with an improved version of the protocol (expected in Q2 2023 according to the public roadmap). This will be especially important for resource-constrained microcontrollers. Until then, there are a couple of things you can do to work around your issue.
/**
* Default maximum batch size possible to be received.
*/
#ifndef Z_BATCH_SIZE_RX
#define Z_BATCH_SIZE_RX \
65535 // Warning: changing this value can break the communication
// with zenohd in the current protocol version.
// In the future, it will be possible to negotiate such value.
// Change it at your own risk.
#endif
/**
* Default maximum batch size possible to be sent.
*/
#ifndef Z_BATCH_SIZE_TX
#define Z_BATCH_SIZE_TX 65535
#endif
/**
* Default maximum size for fragmented messages.
*/
#ifndef Z_FRAG_MAX_SIZE
#define Z_FRAG_MAX_SIZE 300000
#endif
Let us know if these workarounds were able to solve your problem. Related to #151.
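To see why the defaults quoted above can strain a small heap, the sketch below simply adds up the three limits. It assumes, purely for illustration, that each buffer could reach its configured maximum at the same time; Zenoh-Pico's actual allocation pattern may differ. `worst_case_buffer_bytes` and `fits_in_heap` are hypothetical helper names, not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Default limits, copied from the configuration excerpt quoted above. */
#ifndef Z_BATCH_SIZE_RX
#define Z_BATCH_SIZE_RX 65535
#endif
#ifndef Z_BATCH_SIZE_TX
#define Z_BATCH_SIZE_TX 65535
#endif
#ifndef Z_FRAG_MAX_SIZE
#define Z_FRAG_MAX_SIZE 300000
#endif

/* Hypothetical helper: upper bound in bytes if all three buffers were at
 * their configured maximum simultaneously (an assumption, not a measurement). */
size_t worst_case_buffer_bytes(void) {
    return (size_t)Z_BATCH_SIZE_RX + (size_t)Z_BATCH_SIZE_TX + (size_t)Z_FRAG_MAX_SIZE;
}

/* Hypothetical helper: returns 1 if that bound fits in a heap of
 * heap_bytes bytes, 0 otherwise. */
int fits_in_heap(size_t heap_bytes) {
    assert(heap_bytes > 0);
    return worst_case_buffer_bytes() <= heap_bytes;
}
```

With the defaults the bound is 431070 bytes, so `fits_in_heap(256 * 1024)` returns 0. Comparing the bound against the value given to CONFIG_HEAP_MEM_POOL_SIZE gives a quick sanity check before flashing.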
Hello, we are also running Zenoh via Zephyr and have reduced the values Z_BATCH_SIZE_RX and Z_BATCH_SIZE_TX to 1024 bytes.
That doesn't seem to fix the problem. I changed I then tried to keep one F429zi in scouting mode and make the other one a publisher in peer mode. My expectation was that the scout would see the publisher as a peer, but it didn't. The scout still only sees the router.
@eeas-joas did it work for you? @dcorbeil see replies inline below:
From a first look (no testing), it seems that you would not have issues with the configured values.
I do not remember having this issue. I need to investigate it a bit with a similar board that we have.
Zenoh-Pico, as of today, fully supports
This is as expected. Zenoh-Pico aims to work mainly as a
@cguimaraes Hello, we are also running Zenoh via Zephyr and have reduced the values Z_BATCH_SIZE_RX and Z_BATCH_SIZE_TX to 1024 bytes, and it works with this config.
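Because the limits are wrapped in `#ifndef` guards, a value defined earlier takes precedence over the default. The sketch below illustrates that mechanism by reproducing the guard pattern locally instead of including the real header; the two `#define` lines stand in for whatever the build actually uses (for example compiler `-D` flags, which is an assumption about the setup), and `configured_batch_rx` / `configured_batch_tx` are hypothetical accessor names. The warning in the header comment still applies: changing the RX value may break communication with zenohd.

```c
#include <assert.h>
#include <stddef.h>

/* Overrides placed before the guarded defaults. In a real build these
 * would more likely arrive as -DZ_BATCH_SIZE_RX=1024 -DZ_BATCH_SIZE_TX=1024. */
#define Z_BATCH_SIZE_RX 1024
#define Z_BATCH_SIZE_TX 1024

/* Local copy of the guard pattern: the defaults only apply when the
 * macro has not been defined yet. */
#ifndef Z_BATCH_SIZE_RX
#define Z_BATCH_SIZE_RX 65535
#endif
#ifndef Z_BATCH_SIZE_TX
#define Z_BATCH_SIZE_TX 65535
#endif

/* Compile-time sanity check on the effective values. */
static_assert(Z_BATCH_SIZE_RX > 0 && Z_BATCH_SIZE_TX > 0,
              "batch sizes must be positive");

/* Hypothetical accessors exposing the effective values. */
size_t configured_batch_rx(void) { return Z_BATCH_SIZE_RX; }
size_t configured_batch_tx(void) { return Z_BATCH_SIZE_TX; }
```

Both accessors return 1024 here because the overrides are seen before the guards.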
Have you had time to give this a try? I tried to keep the
Thank you for those clarifications. Ok I got something that works, kinda. In my current subnet setup I have:
Note: I ended up turning on debug prints (
As you can see, the publisher ends up terminating the session before it even starts sending data. Why is that happening? Also, just for fun, I commented out the Thank you for your help.
I found the problem. Turns out that I still need to do further testing but it seems like my previous hack of fiddling with the socket timeout in Zephyr is not necessary. I suspect that since
In the past, we had issues on Docker when handling multicast traffic.
Indeed, it seems to be the issue. Thanks @dcorbeil for also trying to understand the root cause. This can be easily fixed on Zenoh-Pico. Also, it seems clear that we need:
You can expect it in the following days.
As I wrote in the corresponding PR, we would be happy to receive your contribution in any case.
#159 is tackling point
* z_sleep_ms use zephyr's nanosleep because the previously used usleep doesn't support sleeps for more than 1 second
* Improve sleep resilience considering only their granularity
* Add native zephyr sleep api instead of its posix compatible
* Minimize oversleeping
* Missing the fix on esp32
* Where the start gone?
---------
Co-authored-by: David Corbeil
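The first bullet above can be illustrated with a minimal sketch of a millisecond sleep built on POSIX `nanosleep`, which accepts whole seconds and nanoseconds separately and so is not limited to sub-second durations. This is a host-side illustration under plain POSIX, not the code from the commit (which, per the later bullets, moved to the native Zephyr sleep API); `ms_to_timespec` and `sleep_ms` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Split a millisecond count into whole seconds plus leftover nanoseconds. */
struct timespec ms_to_timespec(unsigned long ms) {
    struct timespec ts;
    ts.tv_sec = (time_t)(ms / 1000UL);
    ts.tv_nsec = (long)(ms % 1000UL) * 1000000L;
    assert(ts.tv_nsec < 1000000000L); /* nanosleep requires tv_nsec below 1e9 */
    return ts;
}

/* Sleep for ms milliseconds. If a signal interrupts the sleep, resume with
 * the remaining time. Returns 0 on success, -1 on any other error. */
int sleep_ms(unsigned long ms) {
    struct timespec req = ms_to_timespec(ms);
    struct timespec rem;
    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR) {
            return -1;
        }
        req = rem; /* continue with whatever time was left */
    }
    return 0;
}
```

Resuming with the remaining time after an interruption, rather than restarting the full interval, is one simple way to avoid oversleeping.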
Describe the bug
Hi all,
I am trying to get the pub/sub example to work on the Nucleo-F429zi. I already have a Zephyr CMake project that I integrated zenoh-pico into, and I got it to build without major issues. The problem comes when I try to get my two F429zi boards to communicate through the pub and sub examples. I use the prj.conf example from docs/zephyr/nucleo_f767zi since the two boards are pretty similar to each other, although some configs are turned off by Zephyr because their dependencies are not met (this might or might not be related to my issue). That applies to the nucleo_f767zi as well, so I commented out ETH_NATIVE_POSIX, ETH_NATIVE_POSIX_RANDOM_MAC and NET_SOCKETS_POSIX_NAMES. I then set my two devices as peers by setting #define CLIENT_OR_PEER 1. I tried connecting my two boards either directly to each other or both through a network switch. In both cases they can ping each other, so I know that at least some communication is working. Then, when I start the publisher and subscriber, both are declared correctly and the publisher publishes its messages, but the subscriber doesn't receive anything.
This is my first time playing with zenoh so I might be missing some important detail. Are there tips and tricks to help debug what's happening? Or rather not happening.
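For readers unfamiliar with the CLIENT_OR_PEER switch mentioned above, the sketch below shows how such a compile-time selector can work. That 1 selects peer mode comes from this report; mapping 0 to client mode is an assumption, and `session_mode` is a hypothetical accessor added only so the result can be inspected. The real examples may define further per-mode settings that are not shown here.

```c
#include <assert.h>
#include <string.h>

/* 1 selects peer mode (as in this report); 0 for client mode is assumed. */
#define CLIENT_OR_PEER 1

#if CLIENT_OR_PEER == 0
#define MODE "client"
#elif CLIENT_OR_PEER == 1
#define MODE "peer"
#else
#error "Unknown mode: CLIENT_OR_PEER must be 0 or 1"
#endif

/* Hypothetical accessor: returns the mode string chosen at compile time. */
const char *session_mode(void) {
    assert(strlen(MODE) > 0);
    return MODE;
}
```

Because the choice is made by the preprocessor, both boards have to be rebuilt and reflashed after changing the value.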
I also tried one F429zi in client mode connected directly to my host computer with a router running, but had a different problem. The router was built from zenoh's latest commit as of the day of writing this. The client F429 wasn't able to create a zenoh session.
Here is my prj.conf
The only difference in the prj.conf between my two boards is the IPv4 address.
To reproduce
examples/zephyr/z_sub.c
examples/zephyr/z_pub.c
System info