Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend drake_ros_core package. #6

Merged
merged 15 commits into from
Dec 6, 2021
Merged

Conversation

hidmic
Copy link
Collaborator

@hidmic hidmic commented Jun 9, 2021

Spin off from #2. Built on top of #5.


This change is Reviewable

@hidmic
Copy link
Collaborator Author

hidmic commented Jun 9, 2021

FYI @gbalke

@hidmic
Copy link
Collaborator Author

hidmic commented Jun 14, 2021

@IanTheEngineer friendly ping (or should I bump someone else for review?)

@hidmic hidmic changed the base branch from main to develop August 13, 2021 19:54
@hidmic hidmic force-pushed the hidmic/drake_ros_core_pybind branch from 20d5434 to 102651c Compare August 13, 2021 19:55
@hidmic hidmic force-pushed the hidmic/drake_ros_core_pybind branch from 102651c to 666139c Compare August 25, 2021 18:25
@hidmic
Copy link
Collaborator Author

hidmic commented Aug 25, 2021

Rebased to get linters and CI going.

@hidmic hidmic requested a review from sloretz August 25, 2021 18:26
@hidmic hidmic mentioned this pull request Aug 25, 2021
Copy link
Collaborator

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 24 files at r1, 1 of 19 files at r3, 4 of 4 files at r4, 1 of 2 files at r5.
Reviewable status: 6 of 25 files reviewed, 3 unresolved discussions (waiting on @hidmic, @IanTheEngineer, and @sloretz)


drake_ros_bazel_installed/COLCON_IGNORE, line 0 at r5 (raw file):
Was this COLCON_IGNORE file accidentally committed?


drake_ros_core/src/python_bindings/py_serializer.hpp, line 55 at r4 (raw file):

    auto typesupport = py::cast<py::capsule>(
        py_get_typesupport(message_type, "_TYPE_SUPPORT"));
    // TODO(sloretz) use get_pointer() in py 2.6+

Oops, it looks like a%s/pybind11/py/ or similar mistakenly changed a few comments in this file.


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 147 at r4 (raw file):

      case RMW_QOS_POLICY_HISTORY_SYSTEM_DEFAULT:
      case RMW_QOS_POLICY_HISTORY_UNKNOWN:
        value.history(static_cast<rmw_qos_history_policy_t>(history));

For the fall through cases of history, reliability, durability, and liveliness - it seems unlikely, but isn't impossible that more enum values will be added in the future. If so this code would need to be updated.

How about making this code assume the value is valid by making the default case do a static_cast<...> and set the value? It looks like rclcpp makes the same assumption that everything is valid.

Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 25 files reviewed, 3 unresolved discussions (waiting on @IanTheEngineer and @sloretz)


drake_ros_core/src/python_bindings/py_serializer.hpp, line 55 at r4 (raw file):

Previously, sloretz (Shane Loretz) wrote…

Oops, it looks like a%s/pybind11/py/ or similar mistakenly changed a few comments in this file.

Fixed in b92ddd2.


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 147 at r4 (raw file):

Previously, sloretz (Shane Loretz) wrote…

For the fall through cases of history, reliability, durability, and liveliness - it seems unlikely, but isn't impossible that more enum values will be added in the future. If so this code would need to be updated.

How about making this code assume the value is valid by making the default case do a static_cast<...> and set the value? It looks like rclcpp makes the same assumption that everything is valid.

That's fair. See f995b18.


drake_ros_bazel_installed/COLCON_IGNORE, line at r5 (raw file):

Previously, sloretz (Shane Loretz) wrote…

Was this COLCON_IGNORE file accidentally committed?

Indeed. Removed in 05556b7.

@hidmic hidmic requested a review from sloretz August 26, 2021 15:06
Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized this patch was missing a Python linter. Configured pycodestyle to match Drake's configuration (default).

Reviewable status: 1 of 26 files reviewed, 3 unresolved discussions (waiting on @IanTheEngineer and @sloretz)

Copy link
Collaborator

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 7 files at r6.
Reviewable status: 7 of 26 files reviewed, 1 unresolved discussion (waiting on @hidmic, @IanTheEngineer, and @sloretz)


drake_ros_core/pycodestyle.ini, line 2 at r6 (raw file):

[pycodestyle]
# default ignore

Upstream pycodestyle ignores W504 by default, while upstream Drake does not ignore it.

Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 6 of 26 files reviewed, 1 unresolved discussion (waiting on @IanTheEngineer and @sloretz)


drake_ros_core/pycodestyle.ini, line 2 at r6 (raw file):

Previously, sloretz (Shane Loretz) wrote…

Upstream pycodestyle ignores W504 by default, while upstream Drake does not ignore it.

Aha! I missed that. Judging by commit dates, it was intentional. Updated in 9fe467b.

Copy link
Collaborator

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r7.
Reviewable status: 7 of 26 files reviewed, all discussions resolved (waiting on @IanTheEngineer and @sloretz)

Copy link
Member

@IanTheEngineer IanTheEngineer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass. In general these bindings look quite good. I have a few small-ish nits and discussion points below.

Reviewed 1 of 19 files at r2, 17 of 19 files at r3, 1 of 4 files at r4, 4 of 7 files at r6, 1 of 1 files at r7.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @hidmic)


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 1 at r2 (raw file):

// Copyright 2020 Open Source Robotics Foundation, Inc.

nit: 2021 copyright. Ditto for the rest of the files.


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 32 at r2 (raw file):

using drake::systems::Diagram;

This using declaration appears to be unused.


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 36 at r2 (raw file):

using drake_ros_core::DrakeRos;

Similarly unused.


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 48 at r7 (raw file):

  py::module::import("pydrake.multibody.plant");

  // Use std::shared_ptr holder so pybind11 doesn't try to delete interfaces

Minor: Is there


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 50 at r7 (raw file):

  // Use std::shared_ptr holder so pybind11 doesn't try to delete interfaces
  // returned from get_ros_interface
  py::class_<DrakeRosInterface, std::shared_ptr<DrakeRosInterface>>(

Minor: Drake avoids using std:shared_ptr's, even in python bindings. As this is the ROS API, we could make a different design decision here. I would like some input from @EricCousineau-TRI as he has thought about this quite a bit: RobotLocomotion/drake#13058

Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 22 of 26 files reviewed, 4 unresolved discussions (waiting on @EricCousineau-TRI, @IanTheEngineer, and @sloretz)


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 1 at r2 (raw file):

Previously, IanTheEngineer (Ian McMahon) wrote…

nit: 2021 copyright. Ditto for the rest of the files.

Done in f2e6598


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 32 at r2 (raw file):

Previously, IanTheEngineer (Ian McMahon) wrote…
using drake::systems::Diagram;

This using declaration appears to be unused.

Done in e696974


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 36 at r2 (raw file):

Previously, IanTheEngineer (Ian McMahon) wrote…
using drake_ros_core::DrakeRos;

Similarly unused.

It is used below to initialize RosInterfaceSystem


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 48 at r7 (raw file):

Previously, IanTheEngineer (Ian McMahon) wrote…

Minor: Is there

I'm not sure I follow.


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 50 at r7 (raw file):

Previously, IanTheEngineer (Ian McMahon) wrote…

Minor: Drake avoids using std:shared_ptr's, even in python bindings. As this is the ROS API, we could make a different design decision here. I would like some input from @EricCousineau-TRI as he has thought about this quite a bit: RobotLocomotion/drake#13058

If we're strict with ownership, the current design somewhat forces an std::shared_ptr. There are no lifetime constraints between RosSuscriberSystem or RosPublisherSystem and DrakeRosInterface.

Having said, I guess we could switch to passing RosInterfaceSystem raw pointers under the assumption that the lifetime of these systems is managed by a Diagram (if we're careful about not making assumptions when it comes to destruction order). Alternatively, if we assume that Diagram instances help implement ROS nodes (which is how I would use it, and not the other way around), perhaps we don't need a RosInterfaceSystem.

I guess it really depends on the use cases.

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 22 of 26 files reviewed, 5 unresolved discussions (waiting on @hidmic, @IanTheEngineer, and @sloretz)


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 50 at r7 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

If we're strict with ownership, the current design somewhat forces an std::shared_ptr. There are no lifetime constraints between RosSuscriberSystem or RosPublisherSystem and DrakeRosInterface.

Having said, I guess we could switch to passing RosInterfaceSystem raw pointers under the assumption that the lifetime of these systems is managed by a Diagram (if we're careful about not making assumptions when it comes to destruction order). Alternatively, if we assume that Diagram instances help implement ROS nodes (which is how I would use it, and not the other way around), perhaps we don't need a RosInterfaceSystem.

I guess it really depends on the use cases.

My read on DrakeRosInterface is that it's similar in spirit to DrakeLcmInterface:
https://github.com/RobotLocomotion/drake/blob/v0.34.0/lcm/drake_lcm_interface.h

In this case, we use raw pointers to alias things, as this is more of a Drake-centric usage.

For consistency, unique_ptr<> and T* is ideal; it does require the use of py::keep_alive<>, but at least we don't depart from Drake-style usages.


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 33 at r8 (raw file):

namespace detail {
template <>
struct type_caster<drake_ros_core::QoS> {

Is this still necessary given newer rclpy Python bindings?

Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 16 of 26 files reviewed, 5 unresolved discussions (waiting on @EricCousineau-TRI, @IanTheEngineer, and @sloretz)


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 50 at r7 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

My read on DrakeRosInterface is that it's similar in spirit to DrakeLcmInterface:
https://github.com/RobotLocomotion/drake/blob/v0.34.0/lcm/drake_lcm_interface.h

In this case, we use raw pointers to alias things, as this is more of a Drake-centric usage.

For consistency, unique_ptr<> and T* is ideal; it does require the use of py::keep_alive<>, but at least we don't depart from Drake-style usages.

Fair enough. See 4615610.


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 33 at r8 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Is this still necessary given newer rclpy Python bindings?

It is. QoSProfile instances are still pure Python. See https://github.com/ros2/rclpy/blob/6f4e42cb4b409cbd79273dd9dd12cc35198ff138/rclpy/rclpy/qos.py#L57.

Copy link
Member

@IanTheEngineer IanTheEngineer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Thanks for this, Michel! These python bindings look good to me at this point.

Reviewed 3 of 4 files at r8, 7 of 7 files at r9.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @EricCousineau-TRI and @hidmic)


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 48 at r7 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

I'm not sure I follow.

Ok, that make sense! It was a typo, starting the below comment.


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 50 at r7 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

Fair enough. See 4615610.

Thanks for iterating on this. This better conforms to the Drake style of pointer usage.


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 33 at r8 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

It is. QoSProfile instances are still pure Python. See https://github.com/ros2/rclpy/blob/6f4e42cb4b409cbd79273dd9dd12cc35198ff138/rclpy/rclpy/qos.py#L57.

That is quite interesting. As it is, this totally acceptable for this PR, but we may consider upstreaming these pybind11 bindings to rclpy.


drake_ros_core/test/test_pub_sub.py, line 34 at r9 (raw file):

    builder = DiagramBuilder()

    sys_ros_interface = builder.AddSystem(RosInterfaceSystem())

Nit: Preference to verbose full words or typical abbreviations as variable names (even if it makes the variable longer). Here ros_interface_system probably makes sense.

Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 24 of 26 files reviewed, 2 unresolved discussions (waiting on @EricCousineau-TRI and @IanTheEngineer)


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 33 at r8 (raw file):

Previously, IanTheEngineer (Ian McMahon) wrote…

That is quite interesting. As it is, this totally acceptable for this PR, but we may consider upstreaming these pybind11 bindings to rclpy.

Not sure if it'll be the exact same, but we can try to, definitely.


drake_ros_core/test/test_pub_sub.py, line 34 at r9 (raw file):

Previously, IanTheEngineer (Ian McMahon) wrote…

Nit: Preference to verbose full words or typical abbreviations as variable names (even if it makes the variable longer). Here ros_interface_system probably makes sense.

Sounds reasonable. See 7f2fd0a

@hidmic
Copy link
Collaborator Author

hidmic commented Oct 27, 2021

+@EricCousineau-TRI for pending review

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass done on my side, PTAL!

Reviewed 2 of 7 files at r6, 1 of 1 files at r7, 6 of 7 files at r9, 2 of 2 files at r10.
Reviewable status: all files reviewed, 20 unresolved discussions (waiting on @hidmic and @IanTheEngineer)

a discussion (no related file):
nit Filenames here use *.{hpp,cpp}, whereas Drake uses *.{h,cc}.

Is there a strong reason why?
If not, can we more closely align with Drake?

Sorry for just now noticing this; can file an issue s.t. it could be handled outside of this PR.



drake_ros_core/CMakeLists.txt, line 67 at r10 (raw file):

# Python bindings
###
pybind11_add_module(py_drake_ros_core SHARED

nit Target naming convention unclear to me.

I would assume prefix to be package name, but this seems to indicate it's for Python components of drake_ros_core.

Is this an ament convention?
If not, would it be possible to pivot the ordering? e.g. drake_ros_core_py?
This is slightly closer to the conventions we use for Drake (and Anzu) Python targets:
https://drake.mit.edu/doxygen_cxx/group__python__bindings.html (see the "Target Conventions" documentation)


drake_ros_core/CMakeLists.txt, line 80 at r10 (raw file):

# Sets PYTHON_INSTALL_DIR
_ament_cmake_python_get_python_install_dir()

nit This seems to dip into private implementation guts of ament_cmake_python without a TODO or rationale.

Can you provide rationale, perhaps upstream issue / PR?


drake_ros_core/pycodestyle.ini, line 2 at r10 (raw file):

[pycodestyle]
ignore = 'E121,E123,E126,E226,E24,E704,W503'

nit Unclear why these ignores are selected.

Can a terse comment be provided (so that parity can be "tracked")?
e.g. "This is close to the ignore's that Drake uses, as of https://github.com/RobotLocomotion/drake/blob/v0.35.0/tools/lint/python_lint.bzl#L7"


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 60 at r10 (raw file):

      .def(py::init([](pybind11::object type, const char* topic_name,
                       drake_ros_core::QoS qos,
                       DrakeRosInterface* ros_interface) {

This binding needs a keep_alive for ros_interface.

Related example:
https://github.com/RobotLocomotion/drake/blob/b026886f559c2cba1b450d65f53d123adf106ef5/bindings/pydrake/systems/analysis_py.cc#L113-L117


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 71 at r10 (raw file):

      .def(
          py::init([](pybind11::object type, const char* topic_name,
                      drake_ros_core::QoS qos, DrakeRosInterface* ros_interface,

This needs a keep_alive, same as above.


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 84 at r10 (raw file):

      .def(py::init([](pybind11::object type, const char* topic_name,
                       drake_ros_core::QoS qos,
                       DrakeRosInterface* ros_interface) {

This needs a keep_alive, same as above.


drake_ros_core/src/python_bindings/py_serializer.hpp, line 1 at r10 (raw file):

// Copyright 2021 Open Source Robotics Foundation, Inc.

nit This appears to be a utility file for pybind. In Drake, we tend to suffix files like this with _pybind.h:
https://drake.mit.edu/doxygen_cxx/group__python__bindings.html

Can this be renamed accordingly?


drake_ros_core/src/python_bindings/py_serializer.hpp, line 14 at r10 (raw file):

// See the License for the specific language governing permissions and
// limitations under the License.
#ifndef PYTHON_BINDINGS__PY_SERIALIZER_HPP_

nit Unclear why we use ifdef header guards versus #pragma once.

Drake style guide uses #pragma once:
https://drake.mit.edu/styleguide/cppguide.html#The__pragma_once_Guard


drake_ros_core/src/python_bindings/py_serializer.hpp, line 29 at r10 (raw file):

namespace drake_ros_core {
// Serialize/Deserialize Python ROS types to rclcpp::SerializedMessage
class PySerializer : public SerializerInterface {

nit A lot of this appears to be tightly linked to


drake_ros_core/src/python_bindings/py_serializer.hpp, line 34 at r10 (raw file):

    py::dict scope;
    py::exec(
        R"delim(

nit For multiline raw "heredoc" in Drake C++, we use strings of the form R"""(text)""":
https://drake.mit.edu/styleguide/cppguide.html#Raw_String_Literals


drake_ros_core/src/python_bindings/py_serializer.hpp, line 160 at r10 (raw file):

 private:
  py::object message_type_;
  rosidl_message_type_support_t* type_support_;

All of these values should be initialized to nullptr.

See: https://drake.mit.edu/styleguide/cppguide.html#Variable_and_Array_Initialization


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 33 at r8 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

Not sure if it'll be the exact same, but we can try to, definitely.

Is it possible to mention as a TODO?


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 1 at r10 (raw file):

// Copyright 2021 Open Source Robotics Foundation, Inc.

nit Same as above - this is a utility file for pybind.

Can this file have the suffix _pybind.h to align with Drake target conventions?


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 23 at r10 (raw file):

namespace drake_ros_core {
// Thin wrapper that adds default constructor

nit Unclear why default constructor is necessary.

Is it because upstream leaves it completely uninitialized?
Or is the default constructor deleted?

FWIW There is a way to write the type_caster<> to not need a shim type.


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 33 at r10 (raw file):

namespace detail {
template <>
struct type_caster<drake_ros_core::QoS> {

nit In Drake, when writing custom type casters, we typically write the main code in Drake, and then use thin specialization code in pybind11::detail namespace.

Example:
https://github.com/RobotLocomotion/drake/blob/b026886f559c2cba1b450d65f53d123adf106ef5/bindings/pydrake/common/cpp_param_pybind.h#L152-L155


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 67 at r10 (raw file):

    // Clean up references when function exits.
    std::unique_ptr<void, std::function<void(void*)>> scope_exit(

You should use py::object if you want reference counting to be cleaned up, rather than doing piecemeal cleanup?


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 94 at r10 (raw file):

    // Get all the QoS Attributes
    py_history = PyObject_GetAttrString(source, "history");

This appears to checking both the contract and the values; in fact, this seems to silently allow partial contract adherence (which seems like a bad thing?).

Rather than doing this, can you do something like:

py::handle rclpy_cls =py::module::import("some_module").attr("QoS");
if (!py::isinstance(source, rclpy)) {
  return false;
}
// Now all future statements *must* succeed, or throw an error - both retreiving value, and casting.

drake_ros_core/test/test_pub_sub.py, line 15 at r10 (raw file):

# limitations under the License.

from drake_ros_core import RosInterfaceSystem

nit It's better to include first-party stuff after third party. Ideally, you have:

import pydrake
import rclpy
...
import drake_ros_core

Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 28 files reviewed, 17 unresolved discussions (waiting on @EricCousineau-TRI, @IanTheEngineer, and @sloretz)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Filenames here use *.{hpp,cpp}, whereas Drake uses *.{h,cc}.

Is there a strong reason why?
If not, can we more closely align with Drake?

Sorry for just now noticing this; can file an issue s.t. it could be handled outside of this PR.

We've already done non-Python binding changes in here. I'll make sure to curate commits so we can rebase instead of squashing when we merge. See 5f40fa1.



drake_ros_core/CMakeLists.txt, line 67 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Target naming convention unclear to me.

I would assume prefix to be package name, but this seems to indicate it's for Python components of drake_ros_core.

Is this an ament convention?
If not, would it be possible to pivot the ordering? e.g. drake_ros_core_py?
This is slightly closer to the conventions we use for Drake (and Anzu) Python targets:
https://drake.mit.edu/doxygen_cxx/group__python__bindings.html (see the "Target Conventions" documentation)

I don't think there's any convention w.r.t CMake target names. Changed in 1d969ea


drake_ros_core/CMakeLists.txt, line 80 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit This seems to dip into private implementation guts of ament_cmake_python without a TODO or rationale.

Can you provide rationale, perhaps upstream issue / PR?

There's a public function for us to use. Switched to it in 1d969ea


drake_ros_core/pycodestyle.ini, line 2 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Unclear why these ignores are selected.

Can a terse comment be provided (so that parity can be "tracked")?
e.g. "This is close to the ignore's that Drake uses, as of https://github.com/RobotLocomotion/drake/blob/v0.35.0/tools/lint/python_lint.bzl#L7"

Done in aaf2a05


drake_ros_core/test/test_pub_sub.py, line 15 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit It's better to include first-party stuff after third party. Ideally, you have:

import pydrake
import rclpy
...
import drake_ros_core

Sounds good. Done in c7417e7.


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 33 at r8 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Is it possible to mention as a TODO?

Sure. Done in 91389be


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 1 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Same as above - this is a utility file for pybind.

Can this file have the suffix _pybind.h to align with Drake target conventions?

Done in b743312


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 23 at r10 (raw file):

FWIW There is a way to write the type_caster<> to not need a shim type.

Hmm I bet we were not aware of that. pybind/pybind11#864 & co suggest a default-constructible C++ type is required unless PYBIND11_TYPE_CASTER is written out manually?

Yeap, that seems to be the case https://github.com/pybind/pybind11/blob/e7c9753f1d35061d137ac3ce561e94a7407e5583/include/pybind11/cast.h#L88.


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 33 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit In Drake, when writing custom type casters, we typically write the main code in Drake, and then use thin specialization code in pybind11::detail namespace.

Example:
https://github.com/RobotLocomotion/drake/blob/b026886f559c2cba1b450d65f53d123adf106ef5/bindings/pydrake/common/cpp_param_pybind.h#L152-L155

Thing is, PYBIND11_TYPE_CASTER assumes it is within pybind11 namespaces. The example you mention relies on drake::pydrake::internal::type_caster_wrapped (https://github.com/RobotLocomotion/drake/blob/e59b7fc18dbe80b827d07e4a3283a0c87eda7021/bindings/pydrake/common/wrap_pybind.h#L61), which we don't have access to and apparently foregoes PYBIND11_TYPE_CASTER entirely.

Do you want to do the same here?


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 67 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

You should use py::object if you want reference counting to be cleaned up, rather than doing piecemeal cleanup?

Fully agreed. Changed in 91389be


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 94 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

This appears to checking both the contract and the values; in fact, this seems to silently allow partial contract adherence (which seems like a bad thing?).

Rather than doing this, can you do something like:

py::handle rclpy_cls =py::module::import("some_module").attr("QoS");
if (!py::isinstance(source, rclpy)) {
  return false;
}
// Now all future statements *must* succeed, or throw an error - both retreiving value, and casting.

Agreed. Added in 91389be


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 60 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

This binding needs a keep_alive for ros_interface.

Related example:
https://github.com/RobotLocomotion/drake/blob/b026886f559c2cba1b450d65f53d123adf106ef5/bindings/pydrake/systems/analysis_py.cc#L113-L117

Actually, it doesn't. Neither RosPublisherSystem nor RosSubscriberSystem keep a reference to the DrakeRosInterface instance. That's because deep down the underlying rcl instances are shared (e.g. https://github.com/ros2/rclcpp/blob/94264320b4a2fde8999b895824d2c85cf0e4d2ac/rclcpp/include/rclcpp/publisher_base.hpp#L225).


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 71 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

This needs a keep_alive, same as above.

Same as above.


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 84 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

This needs a keep_alive, same as above.

Same as above.


drake_ros_core/src/python_bindings/py_serializer.hpp, line 1 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit This appears to be a utility file for pybind. In Drake, we tend to suffix files like this with _pybind.h:
https://drake.mit.edu/doxygen_cxx/group__python__bindings.html

Can this be renamed accordingly?

Sure. Done in b743312


drake_ros_core/src/python_bindings/py_serializer.hpp, line 14 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Unclear why we use ifdef header guards versus #pragma once.

Drake style guide uses #pragma once:
https://drake.mit.edu/styleguide/cppguide.html#The__pragma_once_Guard

ROS 2 styles creeping in. Fixed them all in 5f40fa1


drake_ros_core/src/python_bindings/py_serializer.hpp, line 29 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit A lot of this appears to be tightly linked to

... Python generated messages and their typesupport? Yes.


drake_ros_core/src/python_bindings/py_serializer.hpp, line 34 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit For multiline raw "heredoc" in Drake C++, we use strings of the form R"""(text)""":
https://drake.mit.edu/styleguide/cppguide.html#Raw_String_Literals

Done in 65b15cd


drake_ros_core/src/python_bindings/py_serializer.hpp, line 160 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

All of these values should be initialized to nullptr.

See: https://drake.mit.edu/styleguide/cppguide.html#Variable_and_Array_Initialization

Good point. Done in 65b15cd

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: with just a few more nits

Reviewed 24 of 25 files at r11.
Reviewable status: 27 of 28 files reviewed, 9 unresolved discussions (waiting on @EricCousineau-TRI, @hidmic, and @IanTheEngineer)

a discussion (no related file):

Previously, hidmic (Michel Hidalgo) wrote…

We've already done non-Python binding changes in here. I'll make sure to curate commits so we can rebase instead of squashing when we merge. See 5f40fa1.

OK Thanks!



drake_ros_core/src/python_bindings/qos_type_caster_pybind.h, line 31 at r11 (raw file):

namespace pybind11 {
namespace detail {
// TODO(hidmic): rely on rclpy.qos.QoSProfile

nit Can you file an issue and reference it here?


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 23 at r10 (raw file):

Hmm I bet we were not aware of that.

That being said, can you add a comment to make it obvious why default constructor is desired? e.g. ... since rclcpp::QoS deletes its default constructor?


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 33 at r10 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

Thing is, PYBIND11_TYPE_CASTER assumes it is within pybind11 namespaces. The example you mention relies on drake::pydrake::internal::type_caster_wrapped (https://github.com/RobotLocomotion/drake/blob/e59b7fc18dbe80b827d07e4a3283a0c87eda7021/bindings/pydrake/common/wrap_pybind.h#L61), which we don't have access to and apparently foregoes PYBIND11_TYPE_CASTER entirely.

Do you want to do the same here?

OK Yeah, forgot about how much of the weirdness is in there. Nah, it's fine to leave as-is for now.


drake_ros_core/src/python_bindings/module_drake_ros_core.cpp, line 60 at r10 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

Actually, it doesn't. Neither RosPublisherSystem nor RosSubscriberSystem keep a reference to the DrakeRosInterface instance. That's because deep down the underlying rcl instances are shared (e.g. https://github.com/ros2/rclcpp/blob/94264320b4a2fde8999b895824d2c85cf0e4d2ac/rclcpp/include/rclcpp/publisher_base.hpp#L225).

OK Ah, gotcha. Thanks for clarifying!


drake_ros_core/src/python_bindings/py_serializer.hpp, line 29 at r10 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

... Python generated messages and their typesupport? Yes.

Ah, sorry, incomplete thought.

But yeah, a lot of it smells like Python code written in pybind11.
Ideally, a lot of this should just happen in Python itself as is done in Drake:
https://github.com/RobotLocomotion/drake/blob/4696afeacbbd87cfc962f10404f74c03f2eea575/bindings/pydrake/systems/lcm_py.cc#L209
https://github.com/RobotLocomotion/drake/blob/4696afeacbbd87cfc962f10404f74c03f2eea575/bindings/pydrake/systems/_lcm_extra.py#L7-L30

Consider adding TODO to that effect?


drake_ros_core/src/python_bindings/py_serializer_pybind.h, line 87 at r11 (raw file):

    py::dict scope;
    scope["av"] = &abstract_value;
    py::object message = py::eval("av.Clone().get_mutable_value()", scope);

nit py::eval() seems a little heavy-handed (requires parsing each time).

Instead, consider breaking it down (and using non-mutable version):

py::object abstract_value_py = py::cast(&abstract_value);
py::object message = abstract_value_py.attr("get_value")();

Alternatively, consider writing most of this code in Python itself (see above).


drake_ros_core/src/python_bindings/py_serializer_pybind.h, line 87 at r11 (raw file):

    py::dict scope;
    scope["av"] = &abstract_value;
    py::object message = py::eval("av.Clone().get_mutable_value()", scope);

nit Unclear why this is mutable.

Can this just access const value of message?


drake_ros_core/src/python_bindings/py_serializer_pybind.h, line 112 at r11 (raw file):

  bool deserialize(const rclcpp::SerializedMessage& serialized_message,
                   drake::AbstractValue& abstract_value) const override {
    // TODO(sloretz) it would be so much more convenient if I didn't have to

nit I'm not sure if fully understand this.

Is it more so "why doesn't RMW support to-and-from bytes?"


drake_ros_core/src/python_bindings/py_serializer_pybind.h, line 113 at r11 (raw file):

                   drake::AbstractValue& abstract_value) const override {
    // TODO(sloretz) it would be so much more convenient if I didn't have to
    // care that the Python typesupport used the C type support internally.

nit Can this be filed as upstream issue?

Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All comments addresed, PTAL!

Reviewable status: 17 of 31 files reviewed, 7 unresolved discussions (waiting on @EricCousineau-TRI and @IanTheEngineer)


drake_ros_core/src/python_bindings/qos_type_caster_pybind.h, line 31 at r11 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Can you file an issue and reference it here?

Done in 021f259


drake_ros_core/src/python_bindings/qos_type_caster.hpp, line 23 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Hmm I bet we were not aware of that.

That being said, can you add a comment to make it obvious why default constructor is desired? e.g. ... since rclcpp::QoS deletes its default constructor?

Done in 021f259


drake_ros_core/src/python_bindings/py_serializer.hpp, line 29 at r10 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Ah, sorry, incomplete thought.

But yeah, a lot of it smells like Python code written in pybind11.
Ideally, a lot of this should just happen in Python itself as is done in Drake:
https://github.com/RobotLocomotion/drake/blob/4696afeacbbd87cfc962f10404f74c03f2eea575/bindings/pydrake/systems/lcm_py.cc#L209
https://github.com/RobotLocomotion/drake/blob/4696afeacbbd87cfc962f10404f74c03f2eea575/bindings/pydrake/systems/_lcm_extra.py#L7-L30

Consider adding TODO to that effect?

I ended up carrying through with the refactor. See 7e1bea0.


drake_ros_core/src/python_bindings/py_serializer_pybind.h, line 87 at r11 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit py::eval() seems a little heavy-handed (requires parsing each time).

Instead, consider breaking it down (and using non-mutable version):

py::object abstract_value_py = py::cast(&abstract_value);
py::object message = abstract_value_py.attr("get_value")();

Alternatively, consider writing most of this code in Python itself (see above).

See 7e1bea0.


drake_ros_core/src/python_bindings/py_serializer_pybind.h, line 87 at r11 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Unclear why this is mutable.

Can this just access const value of message?

No longer applies. See 7e1bea0.


drake_ros_core/src/python_bindings/py_serializer_pybind.h, line 112 at r11 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit I'm not sure if fully understand this.

Is it more so "why doesn't RMW support to-and-from bytes?"

No longer applies. See 7e1bea0.


drake_ros_core/src/python_bindings/py_serializer_pybind.h, line 113 at r11 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Can this be filed as upstream issue?

No longer applies. See 7e1bea0.

Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @IanTheEngineer. I'll go ahead and cleanup commits so we can get this rebase-merged.

Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @EricCousineau-TRI)


drake_ros_core/src/drake_ros.cc, line 72 at r17 (raw file):

Previously, IanTheEngineer (Ian McMahon) wrote…

FYI, I'm fine with leaving both spins at this point, assuming that spin_once(0) is a no-op?

Not necessarily, it might dispatch one item.

- Sort out public/private API boundaries
- Use raw pointers instead of std::shared_ptr
- Improve compliance with Drake's C++ style
- Extend Doxygen docstrings

Signed-off-by: Michel Hidalgo <[email protected]>
@hidmic hidmic force-pushed the hidmic/drake_ros_core_pybind branch from c20e2b5 to e8a7859 Compare November 9, 2021 18:10
@hidmic hidmic removed the bug Something isn't working label Nov 9, 2021
Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 7 files at r19, 1 of 3 files at r20, 2 of 2 files at r21.
Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @hidmic and @IanTheEngineer)


drake_ros_core/test/test_pub_sub.py, line 63 at r21 (raw file):

    # Create subscription listening to publisher system
    num_msgs = 5

(Same issues as above C++ code)


drake_ros_core/include/drake_ros_core/drake_ros.h, line 33 at r21 (raw file):

  /** A constructor that wraps a `node_name` ROS2 node with `node_options`.
   See `rclcpp::Node` documentation for further reference on arguments.

nit Node doesn't seem very comprehensive in terms of its overview, esp. in terms of below nit ("what is work?").

Can you refer to Executor as well?
https://docs.ros2.org/galactic/api/rclcpp/classrclcpp_1_1Executor.html#details


drake_ros_core/include/drake_ros_core/drake_ros.h, line 45 at r21 (raw file):

  rclcpp::Node* get_mutable_node() const;

  /** Spins the underlying ROS2 node, dispatching all available work.

nit Unclear what "available work" means exactly.

Can you add a brief inline of Executor's definition of "work"?
https://docs.ros2.org/galactic/api/rclcpp/classrclcpp_1_1Executor.html#details


drake_ros_core/include/drake_ros_core/drake_ros.h, line 48 at r21 (raw file):

   @param[in] timeout_millis Timeout, in milliseconds.
     If timeout is less than 0, the call will block indefinitely
     until some work has been dispatched. If timeout is 0, the call

"until some work has been dispatched" is unclear; will this wait for work to queue up, or only for it to be executed?

Consider rewording a bit (assuming my mental model is correct):

If timeout is less than 0, this will wait indefinitely until work has been queued and executed.
If timeout is 0, the call will not wait for new work, and simply execute all available work.
If timeout is larger than zero, the call will wait for and execute new and available work up the given timeout.

@see rclcpp::spin_some and rclcpp::spin_once

drake_ros_core/include/drake_ros_core/drake_ros.h, line 48 at r21 (raw file):

   @param[in] timeout_millis Timeout, in milliseconds.
     If timeout is less than 0, the call will block indefinitely
     until some work has been dispatched. If timeout is 0, the call

In writing prior comment, I'm confused as to why < 0 is allowed. I don't really see that in the docs.

Can we instead just require timeout_millis >= 0.


drake_ros_core/src/drake_ros.cc, line 72 at r17 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

Not necessarily, it might dispatch one item.

Yeah, can we just reduce this to executor->spin_some(timeout_thing) or spin_all?
Multiple calls confuse me quite a bit.


drake_ros_core/src/drake_ros.cc, line 72 at r18 (raw file):

void DrakeRos::Spin(int timeout_millis) {
  // To match `DrakeLcm::HandleSubscriptions()`'s behavior, in the following

This seems like a contract; can you hoist this to public docstring?


drake_ros_core/src/ros_interface_system.cc, line 45 at r14 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

Yeah. I meant that the executor/scheduler/dispatcher itself will not block waiting for subscriptions to handle. Whether subscriptions callbacks block or not is beyond its control.

Thanks for explanation! That being said, I'm still very confused about nuance between spin_once vs. spin_all. My read on the docs haven't shed much light.

What's the difference between spin_some(0) and spin_all(0)?
My guess is spin_some(0) first blocks for new work, while spin_all(0) doesn't?


drake_ros_core/test/test_pub_sub.cc, line 81 at r15 (raw file):

Previously, IanTheEngineer (Ian McMahon) wrote…

Good find and fix. Thanks!

OK Looks better!


drake_ros_core/test/test_pub_sub.cc, line 45 at r21 (raw file):

  auto ros_interface_system =
      builder.AddSystem<RosInterfaceSystem>(std::make_unique<DrakeRos>());
  auto ros_subscriber_system =

nit Names here are a bit confusing to distinguish connections.

Can you qualify names a bit more? Suggestions:

  • Drop ros_ prefix, since you don't use ros_publisher and ros_subscriber for direct entities
  • Denote what message it's connecting to in the variable, e.g. direct_pub_in -> system_sub_in -> system_pub_out -> direct_sub_out, renaming rx_callback to rx_callback_direct_sub_in, rx_msgs to rx_msgs_direct_sub_in, etc.

drake_ros_core/test/test_pub_sub.cc, line 78 at r21 (raw file):

    // Cope with lack of synchronization between subscriber
    // and publisher systems by ignoring duplicate messages.
    if (rx_msgs.empty() || rx_msgs.back()->uint64_value != msg->uint64_value) {

It was initially unclear why this subscriber would see duplicate messages, then I realized it's from a publisher in the system framework connected to an effectively latched output port (the subscriber system's received message).

Can you instead coordinate this setup s.t. synchronization is done purposefully on the outside?
I think that should simplify this logic.

For example:

  • Set periodic callback to dt_publish = 1.0
  • In for loop
    • Publish to direct_pub_in
    • AdvanceTo(t + dt_1) so that system_sub_in will trigger and queue up message. Assert no change in direct rx msgs.
    • AdvanceTo(t + dt_2) so that system_pub_out will trigger and publish message out, and we immediately check

Ultimately, dt_1 and dt2 should be able to be anything strictly positive that satisfies dt_1 + dt_2 == dt_publish.
My suggestion is dt_1 = dt_2 = dt_publish / 2


drake_ros_core/test/test_pub_sub.cc, line 87 at r21 (raw file):

  size_t num_msgs_sent = 0;
  const int timeout_sec = 5;
  const int spins_per_sec = 10;

nit Is it possible to prefer exactly representable floating point numbers, at least for testing code?
e.g. 1.0, 0.5, 0.25, ..., 1.0 / 32.0, https://docs.python.org/3/tutorial/floatingpoint.html

See: RobotLocomotion/drake#15850

Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 27 of 32 files reviewed, 11 unresolved discussions (waiting on @EricCousineau-TRI and @IanTheEngineer)


drake_ros_core/test/test_pub_sub.py, line 63 at r21 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

(Same issues as above C++ code)

See b56f073. Same word of caution as above.


drake_ros_core/include/drake_ros_core/drake_ros.h, line 33 at r21 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Node doesn't seem very comprehensive in terms of its overview, esp. in terms of below nit ("what is work?").

Can you refer to Executor as well?
https://docs.ros2.org/galactic/api/rclcpp/classrclcpp_1_1Executor.html#details

I can refer to the Executor class but it is disconnected from this particular constructor docs. How about adding it to the class docstring? See 8185909


drake_ros_core/include/drake_ros_core/drake_ros.h, line 45 at r21 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Unclear what "available work" means exactly.

Can you add a brief inline of Executor's definition of "work"?
https://docs.ros2.org/galactic/api/rclcpp/classrclcpp_1_1Executor.html#details

Good idea. See 8185909


drake_ros_core/include/drake_ros_core/drake_ros.h, line 48 at r21 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

In writing prior comment, I'm confused as to why < 0 is allowed. I don't really see that in the docs.

Can we instead just require timeout_millis >= 0.

spin_once allows it, but I see LCM does not. I'll mimic what LCM does when timeout_millis < 0. See 8185909.


drake_ros_core/include/drake_ros_core/drake_ros.h, line 48 at r21 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

"until some work has been dispatched" is unclear; will this wait for work to queue up, or only for it to be executed?

Consider rewording a bit (assuming my mental model is correct):

If timeout is less than 0, this will wait indefinitely until work has been queued and executed.
If timeout is 0, the call will not wait for new work, and simply execute all available work.
If timeout is larger than zero, the call will wait for and execute new and available work up the given timeout.

@see rclcpp::spin_some and rclcpp::spin_once

Done in 8185909


drake_ros_core/src/drake_ros.cc, line 72 at r17 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Yeah, can we just reduce this to executor->spin_some(timeout_thing) or spin_all?
Multiple calls confuse me quite a bit.

Thing is, neither spin_some(timeout) nor spin_all(timeout) with timeout > 0 will wait for work if there's none available. That deviates from DrakeLcm::HandleSubscriptions() behavior. If you guys are happy with that, I'll remove the spin_once invocation.


drake_ros_core/src/drake_ros.cc, line 72 at r18 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

This seems like a contract; can you hoist this to public docstring?

Sounds good. Done in 8185909.


drake_ros_core/src/ros_interface_system.cc, line 45 at r14 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Thanks for explanation! That being said, I'm still very confused about nuance between spin_once vs. spin_all. My read on the docs haven't shed much light.

What's the difference between spin_some(0) and spin_all(0)?
My guess is spin_some(0) first blocks for new work, while spin_all(0) doesn't?

See https://github.com/ros2/rclcpp/blob/0781ea543cc523aa8ecae266847ab42900e2a1fd/rclcpp/src/rclcpp/executor.cpp#L423-L472.

In short,

  • spin_some(0) fetches all available work and processes it. It does not wait.
  • spin_all(0) fetches all available work and processes it. It then repeats that operation until fetching for work yields no work. It does not wait (and thus it will only fetch work more than once if more work arrives while it is processing the previous batch).

I agree docs could be better.


drake_ros_core/test/test_pub_sub.cc, line 45 at r21 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Names here are a bit confusing to distinguish connections.

Can you qualify names a bit more? Suggestions:

  • Drop ros_ prefix, since you don't use ros_publisher and ros_subscriber for direct entities
  • Denote what message it's connecting to in the variable, e.g. direct_pub_in -> system_sub_in -> system_pub_out -> direct_sub_out, renaming rx_callback to rx_callback_direct_sub_in, rx_msgs to rx_msgs_direct_sub_in, etc.

Sounds good. Done in b56f073


drake_ros_core/test/test_pub_sub.cc, line 78 at r21 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

It was initially unclear why this subscriber would see duplicate messages, then I realized it's from a publisher in the system framework connected to an effectively latched output port (the subscriber system's received message).

Can you instead coordinate this setup s.t. synchronization is done purposefully on the outside?
I think that should simplify this logic.

For example:

  • Set periodic callback to dt_publish = 1.0
  • In for loop
    • Publish to direct_pub_in
    • AdvanceTo(t + dt_1) so that system_sub_in will trigger and queue up message. Assert no change in direct rx msgs.
    • AdvanceTo(t + dt_2) so that system_pub_out will trigger and publish message out, and we immediately check

Ultimately, dt_1 and dt2 should be able to be anything strictly positive that satisfies dt_1 + dt_2 == dt_publish.
My suggestion is dt_1 = dt_2 = dt_publish / 2

Fair enough. See b56f073.

As a word of caution, I'd generally avoid writing tests in ROS as if there was a deterministic, sequentially-consistent order of computation. There's not. Transport latency is not constant nor bound. There are reasonable expectations to be had, but those depend on multiple, varying factors (middleware impl., OS scheduler, netstack load, CPU load). A 1 sec publishing period will definitely make test flakes very unlikely (but not impossible e.g. think of some middleware impl. thread not getting scheduled on time for whatever reason).


drake_ros_core/test/test_pub_sub.cc, line 87 at r21 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Is it possible to prefer exactly representable floating point numbers, at least for testing code?
e.g. 1.0, 0.5, 0.25, ..., 1.0 / 32.0, https://docs.python.org/3/tutorial/floatingpoint.html

See: RobotLocomotion/drake#15850

No need to spin multiples times after b56f073

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 5 files at r22.
Reviewable status: 31 of 32 files reviewed, 8 unresolved discussions (waiting on @hidmic and @IanTheEngineer)


drake_ros_core/test/test_pub_sub.py, line 85 at r22 (raw file):

        # subscriber system. The drake ros publisher system should not publish
        # just yet.
        rclpy.spin_once(node, timeout_sec=0.5)

nit Why do we want / need such a large timeout here? Can you make this match C++ above? (timeout = 0)


drake_ros_core/include/drake_ros_core/drake_ros.h, line 33 at r21 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

I can refer to the Executor class but it is disconnected from this particular constructor docs. How about adding it to the class docstring? See 8185909

OK Looks good to me!


drake_ros_core/include/drake_ros_core/drake_ros.h, line 34 at r22 (raw file):

 public:
  /** A constructor that wraps a "drake_ros" ROS node with default options. */
  DrakeRos() : DrakeRos("drake_ros", rclcpp::NodeOptions{}) {}

Unclear why we want a default node name here.

Is it possible to remove this? (or was there a prior discussion motivating it?)


drake_ros_core/src/drake_ros.cc, line 72 at r17 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

Thing is, neither spin_some(timeout) nor spin_all(timeout) with timeout > 0 will wait for work if there's none available. That deviates from DrakeLcm::HandleSubscriptions() behavior. If you guys are happy with that, I'll remove the spin_once invocation.

Gotcha. Yeah, can you remove spin_once? And perhaps use spin_all() instead of spin_once()?

I believe LCM usages are more akin to spin_all(), in that more work could be queued up while the following loop can be running:
https://github.com/RobotLocomotion/drake/blob/bb3d0ffc1360123b4c9096274b6cbdc28c9335ef/lcm/drake_lcm.cc#L290-L297


drake_ros_core/src/drake_ros.cc, line 42 at r22 (raw file):

  impl_->context = node_options.context();
  if (impl_->context->is_valid()) {
    // Context is being init/shutdown outside of this system

nit "has been init'd and will be shutdown"

(or perhaps remove given below?)


drake_ros_core/src/drake_ros.cc, line 46 at r22 (raw file):

  } else {
    // This system will init/shutdown the context
    impl_->externally_init = false;

While this feature ("auto-init") seems useful, it also seems like it'll cause confusing down the line.

Can we instead just require that rcl*::init() (or any valid invocation similar that) has already been called?
e.g. in class doc overview:

@pre Requires that rcl* context has been initialized.

drake_ros_core/src/drake_ros.cc, line 52 at r22 (raw file):

  impl_->node.reset(new rclcpp::Node(node_name, node_options));

  // TODO(sloretz) allow passing in executor options

BTW It seems like the only need for sync for the executor is context. If that's the case, then we should just let users pass in an executor instance itself?


drake_ros_core/src/ros_interface_system.cc, line 45 at r14 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

See https://github.com/ros2/rclcpp/blob/0781ea543cc523aa8ecae266847ab42900e2a1fd/rclcpp/src/rclcpp/executor.cpp#L423-L472.

In short,

  • spin_some(0) fetches all available work and processes it. It does not wait.
  • spin_all(0) fetches all available work and processes it. It then repeats that operation until fetching for work yields no work. It does not wait (and thus it will only fetch work more than once if more work arrives while it is processing the previous batch).

I agree docs could be better.

OK That's starting to make more sense.


drake_ros_core/test/test_pub_sub.cc, line 78 at r21 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

Fair enough. See b56f073.

As a word of caution, I'd generally avoid writing tests in ROS as if there was a deterministic, sequentially-consistent order of computation. There's not. Transport latency is not constant nor bound. There are reasonable expectations to be had, but those depend on multiple, varying factors (middleware impl., OS scheduler, netstack load, CPU load). A 1 sec publishing period will definitely make test flakes very unlikely (but not impossible e.g. think of some middleware impl. thread not getting scheduled on time for whatever reason).

OK Gotcha. In practice, though, I'm not sure if we've seen a huge issue of LCM pub/sub via local loopback when within the same process, at least not for small "toy" payloads.


drake_ros_core/test/test_pub_sub.cc, line 87 at r21 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

No need to spin multiples times after b56f073

OK Awesome, thanks!


drake_ros_core/test/test_pub_sub.cc, line 42 at r22 (raw file):

  auto system_ros =
      builder.AddSystem<RosInterfaceSystem>(std::make_unique<DrakeRos>());

nit Related to below - can you assign this an explicit node name?


drake_ros_core/test/test_pub_sub.cc, line 66 at r22 (raw file):

  // Don't need to rclcpp::init because DrakeRos uses global rclcpp::Context by
  // default
  auto node = rclcpp::Node::make_shared("sub_to_pub");

nit Can you rename this variable to direct_ros or direct_node?


drake_ros_core/test/test_pub_sub.cc, line 66 at r22 (raw file):

  // Don't need to rclcpp::init because DrakeRos uses global rclcpp::Context by
  // default
  auto node = rclcpp::Node::make_shared("sub_to_pub");

nit Unclear why this node has a name different than above.

Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 26 of 32 files reviewed, 5 unresolved discussions (waiting on @EricCousineau-TRI and @IanTheEngineer)


drake_ros_core/test/test_pub_sub.py, line 85 at r22 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Why do we want / need such a large timeout here? Can you make this match C++ above? (timeout = 0)

Argh, that's a leftover of a debugging session. Removed in 8e23c11.


drake_ros_core/include/drake_ros_core/drake_ros.h, line 34 at r22 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Unclear why we want a default node name here.

Is it possible to remove this? (or was there a prior discussion motivating it?)

No prior discussion. Done in 7db7f58


drake_ros_core/src/drake_ros.cc, line 72 at r17 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Gotcha. Yeah, can you remove spin_once? And perhaps use spin_all() instead of spin_once()?

I believe LCM usages are more akin to spin_all(), in that more work could be queued up while the following loop can be running:
https://github.com/RobotLocomotion/drake/blob/bb3d0ffc1360123b4c9096274b6cbdc28c9335ef/lcm/drake_lcm.cc#L290-L297

spin_all does not currently support a zero timeout. Not entirely clear why. Changed in 8836bc4 to spin_some only. We can submit a ticket upstream and try fight that API contract.


drake_ros_core/src/drake_ros.cc, line 42 at r22 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit "has been init'd and will be shutdown"

(or perhaps remove given below?)

Done in da5dcfb.


drake_ros_core/src/drake_ros.cc, line 46 at r22 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

While this feature ("auto-init") seems useful, it also seems like it'll cause confusing down the line.

Can we instead just require that rcl*::init() (or any valid invocation similar that) has already been called?
e.g. in class doc overview:

@pre Requires that rcl* context has been initialized.

Hmm, I think that won't work for Python code. While both rclpy and rclcpp contexts eventually wrap rcl's, they are not thin wrappers nor is rclpy a set of bindings for rclcpp. We could have a drake_ros specific function to perform the initialization though.


drake_ros_core/src/drake_ros.cc, line 52 at r22 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

BTW It seems like the only need for sync for the executor is context. If that's the case, then we should just let users pass in an executor instance itself?

There are no Python bindings for rclcpp executors. We could allow it in the C++ API though. Does that sound reasonable?


drake_ros_core/test/test_pub_sub.cc, line 42 at r22 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Related to below - can you assign this an explicit node name?

Done in 0309ad9316090941abbfc21955b17bb3f7584d19


drake_ros_core/test/test_pub_sub.cc, line 66 at r22 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Can you rename this variable to direct_ros or direct_node?

Done in 64f560a7aaf3af1af5087539fe762c5527efb75b


drake_ros_core/test/test_pub_sub.cc, line 66 at r22 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Unclear why this node has a name different than above.

Hmm, not sure what you mean by the above. The one in the C++ test?

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 7 files at r19, 6 of 6 files at r23.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @hidmic)


drake_ros_core/test/test_pub_sub.py, line 85 at r22 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

Argh, that's a leftover of a debugging session. Removed in 8e23c11.

OK Thanks!


drake_ros_core/src/drake_ros.cc, line 72 at r17 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

spin_all does not currently support a zero timeout. Not entirely clear why. Changed in 8836bc4 to spin_some only. We can submit a ticket upstream and try fight that API contract.

OK Odd. Perhaps a TODO for now?


drake_ros_core/src/drake_ros.cc, line 46 at r22 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

Hmm, I think that won't work for Python code. While both rclpy and rclcpp contexts eventually wrap rcl's, they are not thin wrappers nor is rclpy a set of bindings for rclcpp. We could have a drake_ros specific function to perform the initialization though.

Yeah, I prefer explicit initialization over implicit initialization, at least to start.


drake_ros_core/src/drake_ros.cc, line 52 at r22 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

There are no Python bindings for rclcpp executors. We could allow it in the C++ API though. Does that sound reasonable?

OK Yup! I'd be fine with that living as a TODO, though.


drake_ros_core/test/test_pub_sub.cc, line 66 at r22 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

Hmm, not sure what you mean by the above. The one in the C++ test?

OK Sorry, meant node name. But having it explicitly named makes it clearer.

Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @hidmic)


drake_ros_core/src/drake_ros.cc, line 72 at r17 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

OK Odd. Perhaps a TODO for now?

Oh, right, spin_all doesn't allow zero timeouts because that could result in an indefinite wait (if work gets queued up faster or at the exact same pace it gets dequeued). Still, opened to a ticket to foster discussion ros2/rclcpp#1825. See 74f4a80.

Copy link
Collaborator Author

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 25 of 32 files reviewed, 1 unresolved discussion (waiting on @EricCousineau-TRI)


drake_ros_core/src/drake_ros.cc, line 46 at r22 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Yeah, I prefer explicit initialization over implicit initialization, at least to start.

Done in 471c8c2.


drake_ros_core/src/drake_ros.cc, line 52 at r22 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

OK Yup! I'd be fine with that living as a TODO, though.

Done in 471c8c2

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I think this is ready to merge - will leave it up to you or Ian.

Reviewed 7 of 7 files at r25.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @hidmic)


drake_ros_core/test/test_pub_sub.py, line 1 at r25 (raw file):

# Copyright 2021 Open Source Robotics Foundation, Inc.

BTW Would be better to see this as unittest, but that could be handled in follow-up PR.

In that way, we can show how to write unittests.


drake_ros_core/src/drake_ros.cc, line 72 at r17 (raw file):

Previously, hidmic (Michel Hidalgo) wrote…

Oh, right, spin_all doesn't allow zero timeouts because that could result in an indefinite wait (if work gets queued up faster or at the exact same pace it gets dequeued). Still, opened to a ticket to foster discussion ros2/rclcpp#1825. See 74f4a80.

OK Thanks!

@hidmic
Copy link
Collaborator Author

hidmic commented Dec 6, 2021

Alright, I'll move forward with this @IanTheEngineer. Thank you all for the thorough reviews.

@hidmic hidmic changed the title Add Python bindings for drake_ros_core package. Extend drake_ros_core package. Dec 6, 2021
@hidmic hidmic merged commit 46db350 into develop Dec 6, 2021
@hidmic hidmic deleted the hidmic/drake_ros_core_pybind branch December 6, 2021 16:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants