ergoCub 1.3 S/N:002 – l_shoulder_pitch goes into overheating fault #2007

Open
mebbaid opened this issue Jan 7, 2025 · 14 comments
Labels
ergoCub 1.3 S/N:002 ergoCub1.3 platform

Comments

@mebbaid

mebbaid commented Jan 7, 2025

Robot Name 🤖

ergoCub 1.3 S/N:002

Request/Failure description

Image

Detailed context

The error occurred during a teleoperation session. The robot was not walking.

Additional context

No response

How does it affect you?

No response

@github-actions github-actions bot changed the title from "l_shoulder_pitch goes into overheating fault" to "ergoCub 1.3 S/N:002 – l_shoulder_pitch goes into overheating fault" Jan 7, 2025
@github-actions github-actions bot added the ergoCub 1.3 S/N:002 ergoCub1.3 platform label Jan 7, 2025
@mebbaid
Author

mebbaid commented Jan 8, 2025

I turned on the robot this morning, and the error occurred again before I moved any joint. This time the log gave additional context, with a negative raw temperature value. See:

	195.915567	ERROR			from BOARD 10.0.1.2 (left_arm-eb2-j0_1) time=550s 816m 404u :  MC: overheating. Temperature hardware limit exceeded. The motor has been turned off to prevent it from being damaged by overheating. (Joint=l_shoulder_pitch (NIB=0), Raw_temperature_value=-886)

cc @S-Dafarra

@S-Dafarra

I believe there could be an issue with the sensor

cc @MSECode @valegagge

@S-Dafarra

Here is a log with several overheating faults on:

  • l_shoulder_pitch
  • r_shoulder_pitch
  • r_hip_yaw

log_overheating.zip

@MSECode

MSECode commented Jan 8, 2025

I believe there could be an issue with the sensor

cc @MSECode @valegagge

Yes, that might be. If it keeps happening, I suggest asking proto to check the sensors. There might be an issue with the cable. Let's check that first, and then you know how to proceed.
Another thing: I don't remember whether those joints have the I2C extender connected to the TDB to remove the disturbances.

@S-Dafarra

Another thing: I don't remember whether those joints have the I2C extender connected to the TDB to remove the disturbances.

The shoulders definitely do not. I don't remember about the hip yaw, but I don't think so.

BTW, what is the formula to convert from raw to absolute value? Just to understand if the value is weird or not

@valegagge
Member

Hi @S-Dafarra ,

BTW, what is the formula to convert from raw to absolute value? Just to understand if the value is weird or not

In the icub-tech documentation you can find the raw values we use to signal error conditions.

There is no exact formula for converting raw values to degrees Celsius, because the transformation is done through a lookup table (which depends on the sensor type, PT100 or PT1000).

In the code we approximate it with a function. If you are interested, I can give you a pointer to that function.

BTW, I'd like to understand why you need to

understand if the value is weird or not

just to help you in the best way.
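
For a rough sanity check of what a platinum sensor reading means, the standard IEC 60751 (Callendar-Van Dusen) relation for PT100/PT1000 elements can be inverted as sketched below. This is not the lookup table or the approximating function used in the firmware, and it starts from a resistance in ohms: how the raw value maps to a resistance is not stated in this thread, so that step is left out. The function name is illustrative only.

```python
import math

# IEC 60751 coefficients for standard platinum RTDs (valid for T >= 0 degC).
A = 3.9083e-3
B = -5.775e-7


def rtd_resistance_to_celsius(resistance_ohm: float, r0_ohm: float = 100.0) -> float:
    """Invert R(T) = R0 * (1 + A*T + B*T**2) for T >= 0 degC.

    r0_ohm is 100.0 for a PT100 and 1000.0 for a PT1000.
    """
    ratio = resistance_ohm / r0_ohm
    if ratio < 1.0:
        # Below 0 degC the standard relation has an extra term; not handled here.
        raise ValueError("resistance corresponds to a temperature below 0 degC")
    discriminant = A * A - 4.0 * B * (1.0 - ratio)
    if discriminant < 0.0:
        raise ValueError("resistance is outside the range of the relation")
    return (-A + math.sqrt(discriminant)) / (2.0 * B)


if __name__ == "__main__":
    print(round(rtd_resistance_to_celsius(138.5055), 2))          # PT100
    print(round(rtd_resistance_to_celsius(1385.055, 1000.0), 2))  # PT1000
```

The closed-form inverse is used here only because it is short; a lookup table with interpolation, as described above, serves the same purpose.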

@S-Dafarra

BTW, I'd like to understand why you need to

understand if the value is weird or not

just to help you in the best way.

If you look at the log in #2007 (comment), there are several errors with positive values, usually on the order of 300. I would like to understand from the error whether it can be safely ignored. For example, if the reported temperature is over 100 °C and I can still touch the part, then I can assume it is a false positive. If instead the temperature is below 100 °C, I might think that the error is legitimate (just to avoid issues like #1986).

@mebbaid
Author

mebbaid commented Jan 9, 2025

I turned on the robot this morning and the error recurred; this time the raw value is positive.

	612.509301	ERROR			from BOARD 10.0.1.2 (left_arm-eb2-j0_1) time=948s 560m 675u :  MC: overheating. Temperature hardware limit exceeded. The motor has been turned off to prevent it from being damaged by overheating. (Joint=l_shoulder_pitch (NIB=0), Raw_temperature_value=344)

@S-Dafarra

Regarding the arms, I also realized that I did not downgrade the 2FOC version, so the overheating issues might also be related to the current control. This does not apply to the legs.

@valegagge
Member

We'll provide the lookup table for converting the raw value to degrees as soon as possible. Then we'll update the diagnostics to report degrees Celsius (it is a very simple task, but there are some software dependency issues).

@valegagge
Member

We'll provide the lookup table for converting the raw value to degrees as soon as possible. Then we'll update the diagnostics to report degrees Celsius (it is a very simple task, but there are some software dependency issues).

icub-tech-iit/documentation#388

cc @MSECode Thanks!!!

@valegagge
Member

Hi @MSECode ,
if I'm not mistaken, the values reported when the joints went into fault correspond to about 30 degrees. So something is not working properly.

I guess we haven't dumped any temperature values because the issue occurred before you started your experiment. Is that correct, @S-Dafarra?

@S-Dafarra

I guess we haven't dumped any temperature values because the issue occurred before you started your experiment. Is that correct, @S-Dafarra?

No, sorry, I don't have data

@MSECode

MSECode commented Jan 10, 2025

The cause of the problem is certainly related to the I2C reading problems on the TDB shown in the log message posted here: #2007 (comment). In fact, the raw temperature value of -886 is the error value indicating that the I2C has not received data for more than 10 seconds continuously. See this section of the documentation: https://icub-tech-iit.github.io/documentation/temperature_sensors/software/dataflow/#error-handling
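
As a rough illustration of this point, the sketch below pulls the joint name and raw value out of diagnostic lines like the ones posted in this thread and flags the -886 sentinel. The regular expression is inferred only from those two log lines, and the function and constant names are hypothetical; the other error codes listed in the documentation are not covered.

```python
import re

# Raw value that, per the comment above, signals that the I2C has not
# received data for more than 10 seconds.
I2C_TIMEOUT_RAW = -886

# Pattern inferred from the two log lines quoted in this thread (assumption).
FAULT_RE = re.compile(
    r"Joint=(?P<joint>\w+) \(NIB=\d+\), Raw_temperature_value=(?P<raw>-?\d+)"
)


def classify_fault(line: str):
    """Return (joint, raw, kind) for an overheating line, or None if no match."""
    match = FAULT_RE.search(line)
    if match is None:
        return None
    raw = int(match.group("raw"))
    # Anything other than the sentinel is left unclassified on purpose.
    kind = "i2c_timeout" if raw == I2C_TIMEOUT_RAW else "other_raw_value"
    return match.group("joint"), raw, kind


if __name__ == "__main__":
    # Shortened version of the log line posted earlier in the thread.
    sample = (
        "MC: overheating. (Joint=l_shoulder_pitch (NIB=0), "
        "Raw_temperature_value=-886)"
    )
    print(classify_fault(sample))
```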
The other log message shows reasonable temperatures because the overheating error on the 2FOC, like all other errors managed by the 2FOC, is cleared only if you switch off the board or request the FORCE_IDLE mode. Since IDLE was not requested on the joint, the joint remained in fault even after the TDB resumed reading correctly, and it keeps streaming the error message every 5 seconds.
The initial failure of the I2C to receive data is likely due to a cabling problem, such as a damaged solder joint or a cable that is pinched or that loses contact in some positions. This would explain why no reading was available at first, while normal temperatures were streamed later.
I therefore suggest checking the TDBs of these joints: #2007 (comment), to see whether there are issues with the soldering and/or the cables.
In any case, if you want to try using the joint in those conditions, just request the IDLE mode; if the error is not raised again, it means the fault has cleared. However, a cable that does not make stable contact is not a desirable working condition. We should check how to avoid that.
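
The latching behaviour described in this comment can be pictured with a toy model like the one below. It is only a sketch of the reasoning, not the 2FOC firmware: the class, its methods, and the limit value are all hypothetical. Only the -886 sentinel and the rule that the fault clears on power-off or FORCE_IDLE come from the comment.

```python
I2C_TIMEOUT_RAW = -886  # sentinel mentioned in this thread


class LatchedJoint:
    """Toy model: once in fault, the joint stays in fault until cleared."""

    def __init__(self, raw_limit: int):
        self.raw_limit = raw_limit  # hypothetical hardware limit, in raw units
        self.in_fault = False

    def on_reading(self, raw: int) -> None:
        # A missing-data sentinel or an over-limit value raises the fault.
        if raw == I2C_TIMEOUT_RAW or raw > self.raw_limit:
            self.in_fault = True
        # A later good reading does NOT clear it: the fault is latched.

    def force_idle(self) -> None:
        # Requesting FORCE_IDLE clears the latched fault.
        self.in_fault = False

    def power_cycle(self) -> None:
        # Switching the board off and on also clears it.
        self.in_fault = False


if __name__ == "__main__":
    joint = LatchedJoint(raw_limit=1000)  # limit chosen arbitrarily
    joint.on_reading(-886)   # I2C stopped delivering data
    joint.on_reading(344)    # readings are back to normal
    print(joint.in_fault)    # True: nothing has cleared the latch yet
    joint.force_idle()
    joint.on_reading(344)
    print(joint.in_fault)    # False: the error is not raised again
```

This mirrors the advice above: request IDLE, and if the fault does not come back, the readings have recovered.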

cc: @valegagge

Projects
Status: Triage

5 participants