Crash at termination with offloading #413

Open
PlagueCZ opened this issue Oct 23, 2023 · 6 comments
Labels
bug Something isn't working

Comments

@PlagueCZ
Contributor

PlagueCZ commented Oct 23, 2023

Describe the bug
Doing a minimal startup and then terminating dpservice-bin via Ctrl+C leads to a SIGSEGV on termination (in DPDK cleanup).

To Reproduce
Running on a PC with a Mellanox ConnectX-6 and two VMs running (using a vfio NIC).
dpservice-bin -l0,1 -- --no-stats
dpservice-cli init
dpservice-cli add interface --id test10 --device 0000:03:00.0_representor_vf0 --vni 123 --ipv4 192.168.123.10 --ipv6 fe80::10
Ctrl+C on dpservice

Stacktrace

Thread 1 "dpservice-bin" received signal SIGSEGV, Segmentation fault.
0x00007ffff5e8bab1 in flow_dv_sample_clone_free_cb ()
   from /usr/local/lib/x86_64-linux-gnu/dpdk/pmds-23.0/librte_net_mlx5.so.23.0

(this is part of rte_eal_cleanup())

Additional information
This does not happen once --no-offload is added (which is expected, given the stack trace).

@PlagueCZ PlagueCZ added the bug Something isn't working label Oct 23, 2023
@PlagueCZ
Contributor Author

@byteocean I'm not sure you can test this. I will hopefully get a separate lab setup soon to test on another machine.

@byteocean
Contributor

byteocean commented Oct 24, 2023

I have not encountered this so far. Could it have something to do with migrating to DPDK 22? That has not been tested on my side yet.

@PlagueCZ
Contributor Author

@byteocean thanks for pointing me in the right direction!
The change that causes this is 61cf7a0 (which removed the DPDK warnings at the end about stopping ports).
So I guess something more needs to be done before stopping the ports, probably something about the flows?

@byteocean
Contributor

byteocean commented Oct 24, 2023

Maybe it is the handle that wraps the age action. You could try removing this indirect action part to see if the error still exists. By the way, I rebased to main and upgraded DPDK to 22.11.3, and I also got this error when I terminated it after running some ping tests.

@byteocean
Contributor

@PlagueCZ FYI, under 22.03 this error (segfault) does not appear.

@PlagueCZ
Contributor Author

@byteocean removing the call to dp_install_default_rule_in_monitoring_group() fixes the error.
I tested 21.11 right before updating to 22.11, and the error does occur there.
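
For illustration only, here is a minimal sketch of the teardown ordering this thread is circling around: release flow rules and any indirect action handle before stopping and closing the port. It is a self-contained model, not dpservice or DPDK code. Every `model_*` name and the `port_model` struct are hypothetical stand-ins (for `rte_flow_flush()`, `rte_flow_action_handle_destroy()`, `rte_eth_dev_stop()` and `rte_eth_dev_close()` respectively), and having the close step report an error on leftover state is an assumption of the model, not the driver's behavior. Whether this ordering actually resolves the crash has not been confirmed here.

```c
#include <stdbool.h>

/* Hypothetical model of one port's state; not a dpservice or DPDK type. */
struct port_model {
    int  flow_count;       /* installed flow rules still alive */
    bool indirect_action;  /* an indirect action handle (e.g. wrapping an age action) is alive */
    bool started;          /* port is started */
    bool closed;           /* port has been closed cleanly */
};

/* Stand-in for rte_flow_flush(): drop every flow rule on the port. */
static void model_flow_flush(struct port_model *p)
{
    p->flow_count = 0;
}

/* Stand-in for rte_flow_action_handle_destroy(): release the indirect action. */
static void model_action_handle_destroy(struct port_model *p)
{
    p->indirect_action = false;
}

/* Stand-in for rte_eth_dev_stop(). */
static void model_dev_stop(struct port_model *p)
{
    p->started = false;
}

/*
 * Stand-in for rte_eth_dev_close().
 * Model assumption: closing with leftover rules or handles is reported
 * as an error (-1) instead of crashing later during cleanup.
 */
static int model_dev_close(struct port_model *p)
{
    if (p->started || p->flow_count > 0 || p->indirect_action)
        return -1;
    p->closed = true;
    return 0;
}

/* Shutdown order under discussion: flows, then handles, then stop, then close. */
static int model_shutdown(struct port_model *p)
{
    model_flow_flush(p);             /* rules go first, they may reference the handle */
    model_action_handle_destroy(p);  /* the handle is now unreferenced */
    model_dev_stop(p);
    return model_dev_close(p);
}
```

In this model, calling `model_shutdown()` on a started port with rules and a live handle returns 0, while stopping and closing the port without the first two steps returns -1, which mirrors the "something more needs to be done before stopping the ports" suspicion above.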
