Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wholebodydynamics may not update the force measurements for more than 100ms #20

Open
S-Dafarra opened this issue Nov 5, 2019 · 15 comments

Comments

@S-Dafarra
Copy link
Collaborator

During the Maker faire Demos it was happening that reading the cartesianEndEffectorWrench ports was returning a null pointer for 10 times (trying every millisecond) in a row. The lines which do this check are https://github.com/robotology/walking-controllers/blob/master/modules/Walking_module/src/RobotHelper.cpp#L65-L85

This seemed to happen mainly when using Wi-Fi, but it did not see to be a range problem as the robot was walking right in front of the router.

@GiulioRomualdi

@S-Dafarra S-Dafarra changed the title Wholebodydynamics may not update the force measurements for more than 10ms Wholebodydynamics may not update the force measurements for more than 100ms Nov 18, 2019
@S-Dafarra
Copy link
Collaborator Author

Today we had again a similar issue, but this time the robot was wired (the Wi-Fi router was switched off) and the delay was more than 100ms. I remember back in time we had a similar issue because of the IMU. Do you think this may be caused by another sensor? @diegoferigo @traversaro

@traversaro
Copy link
Member

traversaro commented Nov 18, 2019

During the Maker faire Demos it was happening that reading the cartesianEndEffectorWrench ports was returning a null pointer for 10 times (trying every millisecond) in a row. The lines which do this check are https://github.com/robotology/walking-controllers/blob/master/modules/Walking_module/src/RobotHelper.cpp#L65-L85

I may be wrong, but if WBD is running with a frequency of 100 Hz, I think it make sense that polling on the port every 1 ms only works 1 out of 10 times, no? Considering that the communication happens over a lossy and non deterministic network, it is possible that in some cases the data is only available every 20 ms, or after 11~12 milliseconds.

@diegoferigo
Copy link
Member

I remember that debugging that Bosh IMU problem has been quite demanding, and it was solved by profiling the code. Is the problem repeatable enough?

@S-Dafarra
Copy link
Collaborator Author

Is the problem repeatable enough?

It happens especially when we have demos 😅

@diegoferigo
Copy link
Member

Best period to debug this problem than :)

@S-Dafarra
Copy link
Collaborator Author

S-Dafarra commented Nov 18, 2019

I was scanning quickly the code and I spotted this line:

https://github.com/robotology/codyco-modules/blob/master/src/devices/wholeBodyDynamics/WholeBodyDynamicsDevice.cpp#L1734

Is this contacting the yarpserver?

@traversaro
Copy link
Member

Today we had again a similar issue, but this time the robot was wired (the Wi-Fi router was switched off) and the delay was more than 100ms. I remember back in time we had a similar issue because of the IMU. Do you think this may be caused by another sensor? @diegoferigo @traversaro

How do you know that the data was not read for 100 ms?

In general, such a delay could be due to the WBD thread not being scheduled or not running for a lot of time for some reason (in the case of IMU, the read method was blocking for a long time) or because something in the YARP network communication is not working fine.

@traversaro
Copy link
Member

I was scanning quickly the code and I spotted this line:

https://github.com/robotology/codyco-modules/blob/master/src/devices/wholeBodyDynamics/WholeBodyDynamicsDevice.cpp#L1734

Is this contacting the yarpserver?

I have no idea, but that part of code is a useless leftover, feel free to remove it as for sure it creates more harm then benefit.

@traversaro
Copy link
Member

Is this contacting the yarpserver?

Apparently not, but it is using some kind of mutex so it is a good idea to remove it in any case: https://github.com/robotology/yarp/blob/cbd800e618d89ae5790af8d75148f02daf203d29/src/libYARP_OS/src/PortCore.cpp#L1473 .

@S-Dafarra
Copy link
Collaborator Author

How do you know that the data was not read for 100 ms?

The walking tried to read from the external wrench ports for 100 times, once every millisecond, but it was receiving NULL from both the ports all the time.

In general, such a delay could be due to the WBD thread not being scheduled or not running for a lot of time for some reason (in the case of IMU, the read method was blocking for a long time) or because something in the YARP network communication is not working fine.

We know that FT boards are currently having firmware issues (robotology/icub-tech-support#854) experiencing crashing once in a while, so I was hoping to run that profiling again.

Apparently not, but it is using some kind of mutex so it is a good idea to remove it in any case: https://github.com/robotology/yarp/blob/cbd800e618d89ae5790af8d75148f02daf203d29/src/libYARP_OS/src/PortCore.cpp#L1473 .

I saw the same, but PortCore seems disappeared in devel.

@diegoferigo
Copy link
Member

At that time I had to develop a ThreadStats class robotology/yarp#1230, that for lack of time was never merged upstream (and we never needed it again since).

As per robotology/yarp#1230 (comment), I would suggest to either use more advanced libraries like Celtoys/Remotery, or embed them in a new class to propose upstream.

@traversaro
Copy link
Member

How do you know that the data was not read for 100 ms?

The walking tried to read from the external wrench ports for 100 times, once every millisecond, but it was receiving NULL from both the ports all the time.

So this line https://github.com/robotology/walking-controllers/blob/f70d2646bb564e4aea607c8f9463b7aec390d04c/modules/Walking_module/src/WalkingModule.cpp#L579 was modified to have 100? This is not related to the issue itself, but I noticed that the controller waits until the feedback is available, that is a bit a dangerous strategy. Have you considered using the old values without stopping the controller, and actually stop the controller when the maximum timeout is reached?

@S-Dafarra
Copy link
Collaborator Author

So this line https://github.com/robotology/walking-controllers/blob/f70d2646bb564e4aea607c8f9463b7aec390d04c/modules/Walking_module/src/WalkingModule.cpp#L579 was modified to have 100? This is not related to the issue itself, but I noticed that the controller waits until the feedback is available, that is a bit a dangerous strategy. Have you considered using the old values without stopping the controller, and actually stop the controller when the maximum timeout is reached?

Yes, exactly, it was a quick test but that error got triggered anyways. Regarding the safety issues, we can think about it.

@S-Dafarra
Copy link
Collaborator Author

As per robotology/yarp#1230 (comment), I would suggest to either use more advanced libraries like Celtoys/Remotery, or embed them in a new class to propose upstream.

Do you think we would need all this machinery for a first study?

@traversaro
Copy link
Member

Do you think we would need all this machinery for a first study?

No, probably just adding a double variable in the WBD in which you save the last end of execution, and you can check from run to run if the difference of execution time is not so different from the period.

@traversaro traversaro transferred this issue from robotology-legacy/codyco-modules Feb 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants