Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Test Failure] rclcpp_action / test_client #326

Open
YuanYuYuan opened this issue Dec 3, 2024 · 6 comments
Open

[Test Failure] rclcpp_action / test_client #326

YuanYuYuan opened this issue Dec 3, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@YuanYuYuan
Copy link
Contributor

YuanYuYuan commented Dec 3, 2024

Info

To reproduce

  1. Build the rclcpp_action
  2. Run the specific command
/workspace/build/rclcpp_action/test_client --gtest_filter="TestClientAgainstServer.*"

Backtrace

#0  0x00007f8c871ca5c4 in std::__shared_ptr<rmw_context_impl_s::Data, (__gnu_cxx::_Lock_policy)2>::get (this=0x2f6e6f697463615f)
    at /usr/include/c++/13/bits/shared_ptr_base.h:1666
#1  0x00007f8c871cb3b8 in std::__shared_ptr_access<rmw_context_impl_s::Data, (__gnu_cxx::_Lock_policy)2, false, false>::_M_get (this=0x2f6e6f697463615f)
    at /usr/include/c++/13/bits/shared_ptr_base.h:1363
#2  0x00007f8c871ca4fe in std::__shared_ptr_access<rmw_context_impl_s::Data, (__gnu_cxx::_Lock_policy)2, false, false>::operator-> (this=0x2f6e6f697463615f)
    at /usr/include/c++/13/bits/shared_ptr_base.h:1357
#3  0x00007f8c871c776d in rmw_context_impl_s::get_node_data (this=0x2f6e6f697463615f, node=0x5646b9d54300)
    at /workspace/src/rmw_zenoh/rmw_zenoh_cpp/src/detail/rmw_context_impl_s.cpp:538
#4  0x00007f8c871c2576 in rmw_zenoh_cpp::ClientData::decrement_in_flight_and_conditionally_remove (this=0x5646b9d4eae0)
    at /workspace/src/rmw_zenoh/rmw_zenoh_cpp/src/detail/rmw_client_data.cpp:555
#5  0x00007f8c871c004f in (anonymous namespace)::client_data_drop (data=0x5646b9d4eae0)
    at /workspace/src/rmw_zenoh/rmw_zenoh_cpp/src/detail/rmw_client_data.cpp:101
#6  0x00007f8c86cbe0c0 in alloc::sync::Arc<T,A>::drop_slow () from /workspace/install/zenoh_c_vendor/opt/zenoh_c_vendor/lib/libzenohc.so
#7  0x00007f8c86a6fe5c in <zenoh::api::session::WeakSession as zenoh::net::primitives::Primitives>::send_response_final ()
   from /workspace/install/zenoh_c_vendor/opt/zenoh_c_vendor/lib/libzenohc.so
#8  0x00007f8c86725f83 in alloc::sync::Arc<T,A>::drop_slow () from /workspace/install/zenoh_c_vendor/opt/zenoh_c_vendor/lib/libzenohc.so
#9  0x00007f8c8671e599 in core::ptr::drop_in_place<zenoh::api::queryable::Query> () from /workspace/install/zenoh_c_vendor/opt/zenoh_c_vendor/lib/libzenohc.so
#10 0x00007f8c8680a0d5 in z_query_drop () from /workspace/install/zenoh_c_vendor/opt/zenoh_c_vendor/lib/libzenohc.so
#11 0x00007f8c871fc641 in z_drop (this_=0x5646b9f2d360) at /workspace/install/zenoh_c_vendor/opt/zenoh_c_vendor/include/zenoh_macros.h:647
#12 0x00007f8c871fc48c in rmw_zenoh_cpp::ZenohQuery::~ZenohQuery (this=0x5646b9f2d360, __in_chrg=<optimized out>)
    at /workspace/src/rmw_zenoh/rmw_zenoh_cpp/src/detail/zenoh_utils.cpp:70
#13 0x00007f8c871e790a in std::default_delete<rmw_zenoh_cpp::ZenohQuery>::operator() (this=0x5646b9e9e6d0, __ptr=0x5646b9f2d360)
    at /usr/include/c++/13/bits/unique_ptr.h:99
#14 0x00007f8c871e6b90 in std::unique_ptr<rmw_zenoh_cpp::ZenohQuery, std::default_delete<rmw_zenoh_cpp::ZenohQuery> >::~unique_ptr (this=0x5646b9e9e6d0,
    __in_chrg=<optimized out>) at /usr/include/c++/13/bits/unique_ptr.h:404
#15 0x00007f8c871ed16b in std::_Destroy<std::unique_ptr<rmw_zenoh_cpp::ZenohQuery, std::default_delete<rmw_zenoh_cpp::ZenohQuery> > > (
    __pointer=0x5646b9e9e6d0) at /usr/include/c++/13/bits/stl_construct.h:151
#16 0x00007f8c871ecb9e in std::_Destroy_aux<false>::__destroy<std::unique_ptr<rmw_zenoh_cpp::ZenohQuery, std::default_delete<rmw_zenoh_cpp::ZenohQuery> >*> (
    __first=0x5646b9e9e6d0, __last=0x5646b9e9e6d8) at /usr/include/c++/13/bits/stl_construct.h:163
#17 0x00007f8c871ebd78 in std::_Destroy<std::unique_ptr<rmw_zenoh_cpp::ZenohQuery, std::default_delete<rmw_zenoh_cpp::ZenohQuery> >*> (__first=0x5646b9e9e6d0,
    __last=0x5646b9e9e6d8) at /usr/include/c++/13/bits/stl_construct.h:196
#18 0x00007f8c871e90e5 in std::_Destroy<std::unique_ptr<rmw_zenoh_cpp::ZenohQuery, std::default_delete<rmw_zenoh_cpp::ZenohQuery> >*, std::unique_ptr<rmw_zenoh_cpp::ZenohQuery, std::default_delete<rmw_zenoh_cpp::ZenohQuery> > > (__last=0x5646b9e9e6d8, __first=0x5646b9e9e6d0)
    at /usr/include/c++/13/bits/alloc_traits.h:948
#19 std::deque<std::unique_ptr<rmw_zenoh_cpp::ZenohQuery, std::default_delete<rmw_zenoh_cpp::ZenohQuery> >, std::allocator<std::unique_ptr<rmw_zenoh_cpp::ZenohQuery, std::default_delete<rmw_zenoh_cpp::ZenohQuery> > > >::_M_destroy_data_aux (this=0x5646b9ec4248,
    __first=std::unique_ptr<rmw_zenoh_cpp::ZenohQuery> = {...}, __last=std::unique_ptr<rmw_zenoh_cpp::ZenohQuery> = {...})
    at /usr/include/c++/13/bits/deque.tcc:875
#20 0x00007f8c871e7f6e in std::deque<std::unique_ptr<rmw_zenoh_cpp::ZenohQuery, std::default_delete<rmw_zenoh_cpp::ZenohQuery> >, std::allocator<std::unique_ptr<rmw_zenoh_cpp::ZenohQuery, std::default_delete<rmw_zenoh_cpp::ZenohQuery> > > >::_M_destroy_data (this=0x5646b9ec4248,
    __first=std::unique_ptr<rmw_zenoh_cpp::ZenohQuery> = {...}, __last=std::unique_ptr<rmw_zenoh_cpp::ZenohQuery> = {...})
--Type <RET> for more, q to quit, c to continue without paging--
    at /usr/include/c++/13/bits/stl_deque.h:2091
#21 0x00007f8c871e70f4 in std::deque<std::unique_ptr<rmw_zenoh_cpp::ZenohQuery, std::default_delete<rmw_zenoh_cpp::ZenohQuery> >, std::allocator<std::unique_ptr<rmw_zenoh_cpp::ZenohQuery, std::default_delete<rmw_zenoh_cpp::ZenohQuery> > > >::~deque (this=0x5646b9ec4248, __in_chrg=<optimized out>)
    at /usr/include/c++/13/bits/stl_deque.h:1028
#22 0x00007f8c871e5e92 in rmw_zenoh_cpp::ServiceData::~ServiceData (this=0x5646b9ec4190, __in_chrg=<optimized out>)
    at /workspace/src/rmw_zenoh/rmw_zenoh_cpp/src/detail/rmw_service_data.cpp:458
#23 0x00007f8c871ed906 in std::_Sp_counted_ptr<rmw_zenoh_cpp::ServiceData*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x5646b9e4a1f0)
    at /usr/include/c++/13/bits/shared_ptr_base.h:428
#24 0x00007f8c8718332b in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x5646b9e4a1f0) at /usr/include/c++/13/bits/shared_ptr_base.h:346
#25 0x00007f8c87184ab7 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x5646b9e04938, __in_chrg=<optimized out>)
    at /usr/include/c++/13/bits/shared_ptr_base.h:1071
#26 0x00007f8c871da014 in std::__shared_ptr<rmw_zenoh_cpp::ServiceData, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x5646b9e04930,
    __in_chrg=<optimized out>) at /usr/include/c++/13/bits/shared_ptr_base.h:1524
#27 0x00007f8c871da034 in std::shared_ptr<rmw_zenoh_cpp::ServiceData>::~shared_ptr (this=0x5646b9e04930, __in_chrg=<optimized out>)
    at /usr/include/c++/13/bits/shared_ptr.h:175
#28 0x00007f8c871e1e2c in std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> >::~pair (this=0x5646b9e04928,
    __in_chrg=<optimized out>) at /usr/include/c++/13/bits/stl_pair.h:187
#29 0x00007f8c871de2eb in std::__new_allocator<std::__detail::_Hash_node<std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> >, false> >::destroy<std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> > > (__p=0x5646b9e04928, this=0x5646b9de7688)
    at /usr/include/c++/13/bits/new_allocator.h:198
#30 std::allocator_traits<std::allocator<std::__detail::_Hash_node<std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> >, false> > >::destroy<std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> > > (__p=0x5646b9e04928, __a=...)
    at /usr/include/c++/13/bits/alloc_traits.h:558
#31 std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> >, false> > >::_M_deallocate_node (this=0x5646b9de7688, __n=0x5646b9e04920) at /usr/include/c++/13/bits/hashtable_policy.h:2011
#32 0x00007f8c871dfc9f in std::_Hashtable<rmw_service_s const*, std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> >, std::allocator<std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> > >, std::__detail::_Select1st, std::equal_to<rmw_service_s const*>, std::hash<rmw_service_s const*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_erase (this=0x5646b9de7688, __bkt=4, __prev_n=0x5646b9e97280, __n=0x5646b9e04920)
    at /usr/include/c++/13/bits/hashtable.h:2353
#33 0x00007f8c871dd63a in std::_Hashtable<rmw_service_s const*, std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> >, std::allocator<std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> > >, std::__detail::_Select1st, std::equal_to<rmw_service_s const*>, std::hash<rmw_service_s const*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_erase (this=0x5646b9de7688, __k=@0x7ffc8e289c60: 0x5646b9e44b10) at /usr/include/c++/13/bits/hashtable.h:2396
#34 0x00007f8c871db785 in std::_Hashtable<rmw_service_s const*, std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> >, std::allocator<std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> > >, std::__detail::_Select1st, std::equal_to<rmw_service_s const*>, std::hash<rmw_service_s const*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::erase (this=0x5646b9de7688, __k=@0x7ffc8e289c60: 0x5646b9e44b10) at /usr/include/c++/13/bits/hashtable.h:984
#35 0x00007f8c871da95b in std::unordered_map<rmw_service_s const*, std::shared_ptr<rmw_zenoh_cpp::ServiceData>, std::hash<rmw_service_s const*>, std::equal_to<rmw_service_s const*>, std::allocator<std::pair<rmw_service_s const* const, std::shared_ptr<rmw_zenoh_cpp::ServiceData> > > >::erase (this=0x5646b9de7688,
    __x=@0x7ffc8e289c60: 0x5646b9e44b10) at /usr/include/c++/13/bits/unordered_map.h:770
--Type <RET> for more, q to quit, c to continue without paging--
#36 0x00007f8c871d91f3 in rmw_zenoh_cpp::NodeData::delete_service_data (this=0x5646b9de75b0, service=0x5646b9e44b10)
    at /workspace/src/rmw_zenoh/rmw_zenoh_cpp/src/detail/rmw_node_data.cpp:313
#37 0x00007f8c8720b725 in rmw_destroy_service (node=0x5646b9b9c000, service=0x5646b9e44b10) at /workspace/src/rmw_zenoh/rmw_zenoh_cpp/src/rmw_zenoh.cpp:1727
#38 0x00007f8c88bd7d92 in rcl_service_fini (service=0x5646b9e0cd90, node=0x5646b9cf1040) at /workspace/src/rcl/rcl/src/rcl/service.c:261
#39 0x00005646ab886cd9 in rclcpp::Service<test_msgs::action::Fibonacci_GetResult>::Service(std::shared_ptr<rcl_node_s>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::AnyServiceCallback<test_msgs::action::Fibonacci_GetResult>, rcl_service_options_s&)::{lambda(rcl_service_s*)#1}::operator()(rcl_service_s*) const (__closure=0x5646b9ea85f0, service=0x5646b9e0cd90) at /workspace/install/rclcpp/include/rclcpp/rclcpp/service.hpp:324
#40 0x00005646ab89bd24 in std::_Sp_counted_deleter<rcl_service_s*, rclcpp::Service<test_msgs::action::Fibonacci_GetResult>::Service(std::shared_ptr<rcl_node_s>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::AnyServiceCallback<test_msgs::action::Fibonacci_GetResult>, rcl_service_options_s&)::{lambda(rcl_service_s*)#1}, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (this=0x5646b9ea85e0)
    at /usr/include/c++/13/bits/shared_ptr_base.h:527
#41 0x00005646ab823b31 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x5646b9ea85e0) at /usr/include/c++/13/bits/shared_ptr_base.h:346
#42 0x00005646ab82c74d in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x5646b9d50748, __in_chrg=<optimized out>)
    at /usr/include/c++/13/bits/shared_ptr_base.h:1071
#43 0x00005646ab826270 in std::__shared_ptr<rcl_service_s, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x5646b9d50740, __in_chrg=<optimized out>)
    at /usr/include/c++/13/bits/shared_ptr_base.h:1524
#44 0x00005646ab826290 in std::shared_ptr<rcl_service_s>::~shared_ptr (this=0x5646b9d50740, __in_chrg=<optimized out>)
    at /usr/include/c++/13/bits/shared_ptr.h:175
#45 0x00005646ab8262d2 in rclcpp::ServiceBase::~ServiceBase (this=0x5646b9d506e0, __in_chrg=<optimized out>)
    at /workspace/install/rclcpp/include/rclcpp/rclcpp/service.hpp:60
#46 0x00005646ab8a2d0a in rclcpp::Service<test_msgs::action::Fibonacci_GetResult>::~Service (this=0x5646b9d506e0, __in_chrg=<optimized out>)
    at /workspace/install/rclcpp/include/rclcpp/rclcpp/service.hpp:440
#47 0x00005646ab8a9212 in std::_Destroy<rclcpp::Service<test_msgs::action::Fibonacci_GetResult> > (__pointer=0x5646b9d506e0)
    at /usr/include/c++/13/bits/stl_construct.h:151
#48 0x00005646ab89ccba in std::allocator_traits<std::allocator<void> >::destroy<rclcpp::Service<test_msgs::action::Fibonacci_GetResult> > (__p=0x5646b9d506e0)
    at /usr/include/c++/13/bits/alloc_traits.h:675
#49 std::_Sp_counted_ptr_inplace<rclcpp::Service<test_msgs::action::Fibonacci_GetResult>, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose (
    this=0x5646b9d506d0) at /usr/include/c++/13/bits/shared_ptr_base.h:613
#50 0x00005646ab836efb in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release_last_use (this=0x5646b9d506d0)
    at /usr/include/c++/13/bits/shared_ptr_base.h:175
#51 0x00005646ab82c71e in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release_last_use_cold (this=0x5646b9d506d0)
    at /usr/include/c++/13/bits/shared_ptr_base.h:199
#52 0x00005646ab823bcf in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x5646b9d506d0) at /usr/include/c++/13/bits/shared_ptr_base.h:353
#53 0x00005646ab82c74d in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7ffc8e28c978, __in_chrg=<optimized out>)
    at /usr/include/c++/13/bits/shared_ptr_base.h:1071
#54 0x00005646ab828a7e in std::__shared_ptr<rclcpp::Service<test_msgs::action::Fibonacci_GetResult>, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (
    this=0x7ffc8e28c970, __in_chrg=<optimized out>) at /usr/include/c++/13/bits/shared_ptr_base.h:1524
#55 0x00005646ab83179a in std::__shared_ptr<rclcpp::Service<test_msgs::action::Fibonacci_GetResult>, (__gnu_cxx::_Lock_policy)2>::reset (this=0x5646b9c7e448)
    at /usr/include/c++/13/bits/shared_ptr_base.h:1642
#56 0x00005646ab82b07c in TestClient::TearDownServer (this=0x5646b9c7d830) at /workspace/src/rclcpp/rclcpp_action/test/test_client.cpp:239
#57 0x00005646ab82b11c in TestClientAgainstServer::TearDown (this=0x5646b9c7d830) at /workspace/src/rclcpp/rclcpp_action/test/test_client.cpp:296
--Type <RET> for more, q to quit, c to continue without paging--
#58 0x00005646ab8f6c6f in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x5646b9c7d830,
    method=&virtual testing::Test::TearDown(), location=0x5646ab91a249 "TearDown()") at /opt/ros/rolling/src/gtest_vendor/./src/gtest.cc:2612
#59 0x00005646ab8eecfb in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=0x5646b9c7d830,
    method=&virtual testing::Test::TearDown(), location=0x5646ab91a249 "TearDown()") at /opt/ros/rolling/src/gtest_vendor/./src/gtest.cc:2648
#60 0x00005646ab8c6cf4 in testing::Test::Run (this=0x5646b9c7d830) at /opt/ros/rolling/src/gtest_vendor/./src/gtest.cc:2695
#61 0x00005646ab8c7873 in testing::TestInfo::Run (this=0x5646b9abeba0) at /opt/ros/rolling/src/gtest_vendor/./src/gtest.cc:2836
#62 0x00005646ab8c82ec in testing::TestSuite::Run (this=0x5646b9abcf40) at /opt/ros/rolling/src/gtest_vendor/./src/gtest.cc:3015
#63 0x00005646ab8d9470 in testing::internal::UnitTestImpl::RunAllTests (this=0x5646b9abbd90) at /opt/ros/rolling/src/gtest_vendor/./src/gtest.cc:5920
#64 0x00005646ab8f7de6 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x5646b9abbd90,
    method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x5646ab8d903c <testing::internal::UnitTestImpl::RunAllTests()>, location=0x5646ab91ac88 "auxiliary test code (environments or event listeners)") at /opt/ros/rolling/src/gtest_vendor/./src/gtest.cc:2612
#65 0x00005646ab8f00f7 in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x5646b9abbd90,
    method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x5646ab8d903c <testing::internal::UnitTestImpl::RunAllTests()>, location=0x5646ab91ac88 "auxiliary test code (environments or event listeners)") at /opt/ros/rolling/src/gtest_vendor/./src/gtest.cc:2648
#66 0x00005646ab8d78a3 in testing::UnitTest::Run (this=0x5646ab996a40 <testing::UnitTest::GetInstance()::instance>)
    at /opt/ros/rolling/src/gtest_vendor/./src/gtest.cc:5484
#67 0x00005646ab8b209e in RUN_ALL_TESTS () at /opt/ros/rolling/src/gtest_vendor/include/gtest/gtest.h:2317
#68 0x00005646ab8b2086 in main (argc=4, argv=0x7ffc8e28cf58) at /opt/ros/rolling/src/gtest_vendor/src/gtest_main.cc:64

Identified issue

rmw_node * suddenly becomes invalid hence get_node_daata failed to complete.

@YuanYuYuan
Copy link
Contributor Author

Fixed by #336.

@clalancette
Copy link
Collaborator

This is still flaky, so reopening.

@Yadunund
Copy link
Member

Closing as i'm unable to reproduce with the latest changes.

@Yadunund
Copy link
Member

Ah I was wrong. This is failing still.

@Yadunund Yadunund reopened this Dec 28, 2024
@Yadunund Yadunund added the bug Something isn't working label Dec 28, 2024
@YuanYuYuan
Copy link
Contributor Author

YuanYuYuan commented Dec 31, 2024

This one was solved by #368 before #365 was merged. We found that the root cause is a race condition in QueryingSubscriber when updating the CancelGoal in the test. 😃

It still doesn't solve this issue with the new advanced pub/sub.

@YuanYuYuan
Copy link
Contributor Author

Study

When running the following test:

/workspace/build/rclcpp_action/test_client

using rmw_zenoh (commit c0f538e1501f702e15f1826cc1e401c50a754046), the following error occurs:

/workspace/src/rclcpp/rclcpp_action/test/test_client.cpp:860: Failure
Expected equality of these values:
  rclcpp_action::GoalStatus::STATUS_CANCELED
    Which is: '\x5' (5)
  goal_handle0->get_status()
    Which is: '\x1' (1)

This indicates that the GoalStatus is not updated to CANCELED as expected.

Root Cause: Race Condition

The error occurs due to a race condition between the initial query and the subscriber callback in the AdvancedSubscriber. The problematic assertion can be found here.

The assertion relies on two actions:

  1. Sending the response for the CancelGoal request.
  2. Publishing a status update (GoalStatus) to the subscriber.

In rmw_zenoh, both publishing and subscribing occur within the same Zenoh session. Normally, the publish action triggers a local callback within the session, which is synchronous and use the same thread. This ensures the GoalStatus update happens before the CancelGoal response arrives.

However, if the Zenoh router is slow for some reason (falled into an abnormal status? we're still investigating why), the initial query from the subscriber (eitherAdvancedSubscriber or the deprecated QueryingSubscriber) might be delayed. In this case, the subscriber callback is executed asynchronously in a different thread. This delay can cause the action server to send the response before the GoalStatus is updated, leading to the assertion failure.

Observation

Adding a delay between dual_spin_until_future_complete and the assertion does not resolve the issue because:

  • The ROS task scheduling is directly affected by the spin in the test.
  • If the response handling is scheduled before the subscriber callback, the callback will still lag behind the status check.

To address this, a delay can be introduced in the client code). This delay allows the subscriber callback to execute first, ensuring the GoalStatus is updated before the response is delivered.

Discussion

Is it appropriate to assume that local pub/sub operations are synchronous for this test?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants