[BUG]: vsomeip slow to establish communication with lots of EventGroup #669
Comments
I've opened draft pull requests:
with the code changes that I've applied locally to address this issue. I would appreciate any feedback on the approach.
I've updated the pull request for 3.4.x (but not 3.1.x) with an additional commit for a problem discovered in testing. I was getting this warning:
and the root-cause was here: vsomeip/implementation/endpoints/src/udp_server_endpoint_impl.cpp Lines 682 to 690 in 6c0e9db
on_message_received supports multiple messages in a single UDP frame but only processes the message:
After changing the train logic to aggregate multiple SOME/IP-SD messages into a single UDP frame, we want it to process all messages found in the frame, whether they are SOME/IP or SOME/IP-SD.
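To illustrate what "process all messages found in the frame" means, here is a minimal sketch that walks a UDP payload and splits it into individual SOME/IP messages using the length field of each header. It is not the code from `udp_server_endpoint_impl.cpp`; `message_view` and `split_frame` are hypothetical names used only for this example. It relies on the SOME/IP header layout (4-byte message ID, 4-byte length that counts everything after the length field) and on SOME/IP-SD using message ID 0xFFFF8100.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: one SOME/IP message inside a UDP payload.
struct message_view {
    const std::uint8_t* data;  // start of the SOME/IP header
    std::size_t size;          // full message size, header included
    bool is_sd;                // true for Service Discovery messages
};

// Hypothetical helper: split a UDP payload into every SOME/IP message it
// contains, SD or not. Stops at the first malformed or truncated message.
std::vector<message_view> split_frame(const std::uint8_t* buf, std::size_t len) {
    constexpr std::size_t full_header = 16;   // complete SOME/IP header
    constexpr std::size_t before_length = 8;  // message ID (4) + length (4)
    constexpr std::uint32_t min_length = 8;   // header bytes covered by the length field

    std::vector<message_view> out;
    std::size_t pos = 0;
    while (len - pos >= full_header) {
        const std::uint8_t* p = buf + pos;
        // The length field is big-endian and counts the bytes that follow it.
        const std::uint32_t length = (std::uint32_t(p[4]) << 24) |
                                     (std::uint32_t(p[5]) << 16) |
                                     (std::uint32_t(p[6]) << 8) |
                                     std::uint32_t(p[7]);
        if (length < min_length || length > len - pos - before_length) {
            break;  // malformed or truncated message
        }
        const std::size_t total = before_length + length;
        // Service ID 0xFFFF with method ID 0x8100 marks a SOME/IP-SD message.
        const bool is_sd = p[0] == 0xFF && p[1] == 0xFF &&
                           p[2] == 0x81 && p[3] == 0x00;
        out.push_back({p, total, is_sd});
        pos += total;  // continue with the next message in the same frame
    }
    return out;
}
```

The point of the sketch is that the loop advances past every message regardless of its type, so a frame holding several aggregated SD messages yields one entry per message.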
Hi @joeyoravec, I have been trying to reproduce your problem in my environment so that we can validate the fix, but I am having some problems. Can you check if these make sense, or provide the ones you used so that I can check them? Thanks!
vSomeip Version
v3.4.10
Boost Version
1.82
Environment
Android and QNX
Describe the bug
My automotive system has a *.fidl with ~3500 attributes, one per CAN signal. My *.fdepl maps each attribute into a unique EventGroup.
Any time the network connection is established, or broken and re-established, I get an avalanche of ~3500 subscribes, followed by ~3500 acknowledgements, transmitted one per frame. The entire sequence does not fit inside a 2-second Service Discovery interval. When the work does not complete within the timeout interval, routingmanager issues StopSubscribe and SubscribeNAK. The system will retry, but it will take a long time, at least a couple of Service Discovery intervals.
The train logic is supposed to aggregate these together, sending a train only when it’s full or 5 ms elapse, but there are several places in the code that prevent this.
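The intended behavior (a train departs only when it is full or when 5 ms have elapsed) can be sketched as below. This is an illustration of the batching rule only, not vsomeip's train implementation; `train_sketch`, its methods, and the 1400-byte capacity used in the usage note are assumptions made for the example. Time is passed in explicitly so the logic can be exercised without real timers.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical sketch of "depart when full or after a maximum wait".
class train_sketch {
public:
    using clock = std::chrono::steady_clock;
    using bytes = std::vector<std::uint8_t>;

    train_sketch(std::size_t max_bytes, std::chrono::milliseconds max_wait)
        : max_bytes_(max_bytes), max_wait_(max_wait) {}

    // Queue one message. If it would overflow the current train, that train
    // departs first (it is returned) and the message starts a new train.
    std::optional<bytes> add(const bytes& message, clock::time_point now) {
        std::optional<bytes> departed;
        if (!buffer_.empty() && buffer_.size() + message.size() > max_bytes_) {
            departed = std::exchange(buffer_, bytes{});
        }
        if (buffer_.empty()) {
            first_queued_ = now;  // the wait is measured from the first passenger
        }
        buffer_.insert(buffer_.end(), message.begin(), message.end());
        return departed;
    }

    // Meant to be called from a timer: the train departs once the oldest
    // queued message has waited for max_wait, even if the train is not full.
    std::optional<bytes> poll(clock::time_point now) {
        if (!buffer_.empty() && now - first_queued_ >= max_wait_) {
            return std::exchange(buffer_, bytes{});
        }
        return std::nullopt;
    }

private:
    std::size_t max_bytes_;
    std::chrono::milliseconds max_wait_;
    bytes buffer_;
    clock::time_point first_queued_{};
};
```

With an assumed capacity of 1400 bytes and a 5 ms wait, many small messages queued in quick succession leave as a few large datagrams. Any code path that forces a departure after every message defeats the aggregation.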
Reproduction Steps
This behavior is easily reproduced when the system has a *.fidl with 1000s of attributes and a *.fdepl that puts each into a unique EventGroup.
Subscribe to all ~3500 attributes, then use
ifconfig down; sleep 10; ifconfig up
to break and re-establish the network connection. Look at the tcpdump and observe the network behavior.
Expected behaviour
The train logic should do a "pretty good job" of aggregating many SUBSCRIBE and many SUBSCRIBEACK entries into each Service Discovery packet.
Logs and Screenshots
With the existing code you should see 1000s of back-to-back SUBSCRIBE like:
each of ~98 bytes, in separate packets, with nothing or almost nothing aggregated. In this region we see a SUBSCRIBENACK and a socket close because the entire sequence exceeded the 2 s Service Discovery timeout interval.
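As a rough size check, the ~98 bytes per packet is consistent with a SOME/IP-SD message carrying a single eventgroup entry and one IPv4 endpoint option. The sketch below does that arithmetic. It uses the SOME/IP-SD wire sizes (16-byte SOME/IP header, 16-byte entry, 12-byte IPv4 endpoint option). The remaining values are assumptions for illustration: untagged Ethernet, IPv4 without options, a 1400-byte UDP payload budget, and all entries in a datagram referencing one shared endpoint option.

```cpp
#include <cstddef>

// Sizes taken from the SOME/IP and SOME/IP-SD wire format.
constexpr std::size_t someip_header = 16;
constexpr std::size_t sd_flags_reserved = 4;
constexpr std::size_t sd_array_length_field = 4;  // one for entries, one for options
constexpr std::size_t sd_eventgroup_entry = 16;
constexpr std::size_t sd_ipv4_endpoint_option = 12;

// Assumptions for this estimate, not measured values:
constexpr std::size_t eth_ipv4_udp_overhead = 14 + 20 + 8;  // no VLAN tag, no IP options
constexpr std::size_t udp_payload_budget = 1400;            // hypothetical per-datagram limit

// Size of one SOME/IP-SD message with the given number of entries and options.
constexpr std::size_t sd_message_size(std::size_t entries, std::size_t options) {
    return someip_header + sd_flags_reserved + 2 * sd_array_length_field +
           entries * sd_eventgroup_entry + options * sd_ipv4_endpoint_option;
}

// How many entries fit in one datagram if they share `shared_options` options.
constexpr std::size_t max_entries_per_datagram(std::size_t shared_options) {
    return (udp_payload_budget - sd_message_size(0, shared_options)) /
           sd_eventgroup_entry;
}

// Datagrams needed to carry `total_entries` at `per_datagram` entries each.
constexpr std::size_t datagrams_needed(std::size_t total_entries,
                                       std::size_t per_datagram) {
    return (total_entries + per_datagram - 1) / per_datagram;
}
```

Under these assumptions, one entry per message comes to 98 bytes on the wire, about 85 entries would fit in one datagram, and ~3500 subscribes would need on the order of 42 datagrams instead of ~3500. The real figures depend on the configured maximum message size and on how options are shared between entries.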