Multiple publishers writing on the same topic cause segfault #8

randaz81 · 2018-06-27T21:15:08Z

Let's consider the following case:

a process spawning multiple threads (i.e. multiple device wrappers inside yarprobotinterface).
each of these threads instantiate its own yarp::os::Node and its own yarp::os::Publisher
all these publishers write on a topic with the same name.

The process will segfault after a random amount of time, probably due to a race condition inside write() method.

A gist containing a snippet of code to reproduce the bug is provided below.

randaz81 · 2018-06-27T21:31:36Z

https://gist.github.com/randaz81/eabcdb90c0598b802f74daee94d3397f

Tobias-Fischer · 2018-07-16T13:44:41Z

Just a thought, but why should each thread instantiate its own Node? Also, in your code snip, there is no correspondence between node[i] and pub2[i], as there is no "registration" of the Publisher with the Node (at least as far as I know).

In ROS, each module calls ros::init() exactly once, and I guess the same makes sense for YARP's Node, doesn't it? Then the question is whether multiple Publishers still crash under one and the same node.

drdanz · 2018-07-16T15:51:23Z

@Tobias-Fischer YARP tries to stay as generic as possible, in order to allow integration with different systems... ROS is just one system, the fact that ROS supports only one node, does not mean that other system will work in the same way, limiting it to one single node adds a limitation in the systems that we will be able to integrate in the future.

Tobias-Fischer · 2018-07-16T15:56:46Z

Sure. If YARP wants to be that generic though and at the same time inter-operable with ROS, a Node should be associated with a Publisher though, isn't it? Currently that is not the case, contrary to the code snippet by @randaz81 which seemingly tries to create one Node per Publisher (I haven't run the snippet so I am not sure what happens currently before the crash occurs with that respect). Feel free to correct me if I'm wrong though, as I said it's just a thought :)

drdanz · 2018-07-16T16:30:01Z

Actually I haven't tested/reviewed @randaz81 code, but at a very quick check, this is likely to be a bug:

    yarp::os::Node* node[num];
    yarp::os::Publisher<geometry_msgs_Point> pub2[num];
    yarp::os::Time::delay(1.0);
    for (int i=0; i<num; i++)
    {
        char name [50];
        sprintf (name,"/testNode%d",i);
        node[i] = new yarp::os::Node (name);
        pub2[i].topic("/pubTest2");
    }

The Nodes should definitely be created before the Publishers. When the Publisher is created, there is no existing Node.

barbalberto · 2018-07-17T07:22:14Z

@Tobias-Fischer our current use case is a wrapper which has to publish data on a specific topic.
Since iCub wrappers are split into parts, we have different wrappers which, independently one from the others, have to publish on the same topic.

If we use the same node for all of them, then we will have many publishers with the same node and same topic, and that's not possible. So we identify different parts using different nodes.

The point here is also to discuss what the best practise should be. Is this the best we can do, or there are other ideas which may be better?

Tobias-Fischer · 2018-07-17T10:33:18Z

Hi @barbalberto,
I think my concern is quite well described in #18. I find the behavior that "YARP is supposed to create a port /topic@/node nested to the latest (temporal order) instantiated Node" not very intuitive, and prone to bugs as shown here and in that issue report.

My suggestion was to pass a reference of the Node to the Publisher (either at instantiation or when calling topic), which would avoid implicit requirements such as the need of creating the Node before the Publisher (see @randaz81 code and @drdanz comment) and also fix #18. This would be generic, however it would break the API.

But again, I am not very familiar with the topic so this is just an idea and I might totally miss the point ..

drdanz · 2018-07-20T12:14:48Z

I think @Tobias-Fischer proposal makes a lot of sense, we might actually consider reviewing and deprecating the current api (I haven't completely understood the differences between passing the topic to the constructor, compared to passing it in topic() and the reason for the various open()) in favour of something like

yarp::os::Publisher<T>::topic(const std::string& topic, const yarp::os::Node& node);
yarp::os::Subscriber<T>::topic(const std::string& topic, const yarp::os::Node& node);

I wonder if it is actually possible (and if it makes sense) for a publisher to publish on 2 different nodes.

randaz81 · 2018-07-20T13:58:58Z

I definitely like this solution and the API revision here proposed.
About the last question, I currently can't find any reasonable example which could motivate the use of a single publisher/subscriber attached to multiple nodes.

barbalberto · 2018-07-24T12:41:25Z

Another idea I had time ago was to completely reverse the logic.
Conceptually the Node is an executable and the topic is not a class like a yarp port but, as the name says, simply a topic... something to talk about 😃

The idea would be to create the Node as a singleton, so the first one using a Node will create it. (I mean nodes with the same name. Different names, different instances).

If an executable needs to publish on the same topic from different threads, it can recall the same Node without having to create another one each time. Furthermore it does not need to care if that node was already created by someone else or not. This will solve the problem we have now regarding the plugins. Any plugin can create a Node object with the same name, but only one will actually instantiated.

Then one can create a topic attached to that Node, like

Node n("myNode");
n.topic("myTopic");      // choose the syntax you like the most, that's an example

This will create a topic and attach it to the Node object, which will manage its life cycle.
Here again, if many objects try to write to the same topic, they will simply get the reference to the already existing topic object.

In this way, if I have many robot wrappers writing all on the same topic /analog@/robot, only one node and one port will actually be created and shared among all plugins.

Basically we can run the two lines above as many times as we want, the result will always be one node and one port.

This will reduce resources consumption and also solve the problem of having many writers on the same topic from the root.

drdanz · 2018-07-27T15:42:09Z

I don't agree with making Node a singleton, because, as I said already,

YARP tries to stay as generic as possible, in order to allow integration with different systems... ROS is just one system, the fact that ROS supports only one node, does not mean that other system will work in the same way, limiting it to one single node adds a limitation in the systems that we will be able to integrate in the future.

But actually, reading the description, I believe you don't actually mean a "singleton", but something that is "unique" depending on the name, is this correct? In this case I totally agree, but I believe that we might have something similar already, see yarp::os::Nodes documentation, even though I have no idea about how it should be used...

In Node constructor:

https://github.com/robotology/yarp/blob/62d24814a32a7fb4f30c8c0eeaf3247111178853/src/libYARP_OS/src/Node.cpp#L550-L563

I believe that here, instead of nodes.addExternalNode(rname, *this); it should check if a Node nome with the same name exists, and eventually "share" the internal port with the one of the already existing node.

barbalberto · 2018-07-30T08:16:27Z

I believe you don't actually mean a "singleton", but something that is "unique" depending on the name, is this correct?

Yes, that's what I meant. You are right, singleton is not the right word, sorry.

AFAIK there is no strong relationship right now between Node and topic. In the code you posted, it looks like the Node (name) is set as the active one. Then, when creating a topic, the new topic uses the current active Node.
I fear there could be concurrency issue here, that's why I proposed to have the Node as a topic container.

I agree on beeing as generic as possible, however the Node is an interface to ROS framework, it embedds parsing of ROS core messages and other ROS specificity so I don't think it will be re-usable with other frameworks anyway.

traversaro · 2018-08-10T13:03:46Z

Relevant discussion (in that case the context is ROS2 and Gazebo plugins) about single node/multiple nodes in a process: ros-simulation/gazebo_ros_pkgs#797 .

damn1 mentioned this issue Sep 7, 2023

Document/clarify that multiple yarp::os::Node objects are allowed for process #17

Open

1 task

traversaro mentioned this issue Jan 9, 2020

Multiple analog sensor ROS publishers robotology/yarp#2172

Merged

randaz81 transferred this issue from robotology/yarp Sep 7, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multiple publishers writing on the same topic cause segfault #8

Multiple publishers writing on the same topic cause segfault #8

randaz81 commented Jun 27, 2018

randaz81 commented Jun 27, 2018

Tobias-Fischer commented Jul 16, 2018 •

edited

Loading

drdanz commented Jul 16, 2018

Tobias-Fischer commented Jul 16, 2018

drdanz commented Jul 16, 2018

barbalberto commented Jul 17, 2018 •

edited

Loading

Tobias-Fischer commented Jul 17, 2018

drdanz commented Jul 20, 2018

randaz81 commented Jul 20, 2018

barbalberto commented Jul 24, 2018 •

edited

Loading

drdanz commented Jul 27, 2018

barbalberto commented Jul 30, 2018

traversaro commented Aug 10, 2018

Multiple publishers writing on the same topic cause segfault #8

Multiple publishers writing on the same topic cause segfault #8

Comments

randaz81 commented Jun 27, 2018

randaz81 commented Jun 27, 2018

Tobias-Fischer commented Jul 16, 2018 • edited Loading

drdanz commented Jul 16, 2018

Tobias-Fischer commented Jul 16, 2018

drdanz commented Jul 16, 2018

barbalberto commented Jul 17, 2018 • edited Loading

Tobias-Fischer commented Jul 17, 2018

drdanz commented Jul 20, 2018

randaz81 commented Jul 20, 2018

barbalberto commented Jul 24, 2018 • edited Loading

drdanz commented Jul 27, 2018

barbalberto commented Jul 30, 2018

traversaro commented Aug 10, 2018

Tobias-Fischer commented Jul 16, 2018 •

edited

Loading

barbalberto commented Jul 17, 2018 •

edited

Loading

barbalberto commented Jul 24, 2018 •

edited

Loading