
Could not multiply by non-number #2

Open · HRItdy opened this issue May 7, 2024 · 23 comments

Comments


HRItdy commented May 7, 2024

Hi Enyen,

Thanks for sharing the code! I have trained a kovis1 model and used the command rob.servo('pick_tube', 10, [0.01, 0.01, 0.01, 0.05, 0, 0], [0.1, 5]) to test it.
I printed the velocity and orientation:

Velocity: [tensor(0.0022), tensor(0.0024), tensor(0.0024), tensor(-3.7409e-06), 0, 0]
Orientation: <Orientation: RV(-1.086, -0.000, -0.000)>

Do they look reasonable? Then the following error was raised:

Traceback (most recent call last):
  File "/home/lab_cheem/miniforge3/envs/kovis1-2.7/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/lab_cheem/miniforge3/envs/kovis1-2.7/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lab_cheem/catkin_ws/src/WP2/kovis1/inference_oja.py", line 204, in _watch_servo
    self.rob.speedl_tool(vec, ACCE, t)
  File "/home/lab_cheem/.local/lib/python3.8/site-packages/urx/robot.py", line 176, in speedl_tool
    v = pose.orient * m3d.Vector(velocities[:3])
  File "/home/lab_cheem/.local/lib/python3.8/site-packages/math3d/vector.py", line 455, in __rmul__
    raise utils.Error('__rmul__ : Could not multiply by non-number')
math3d.utils.Error: __rmul__ : Could not multiply by non-number

This seems to be an internal issue in python-urx. I searched for it but found no solution. Did you encounter this problem before? Any suggestions are appreciated. Thanks!

Regards,
Daiying


enyen commented May 7, 2024

Indeed, it seems to be an issue between urx and math3d.
Maybe you could try reverting math3d to an earlier version: https://gitlab.com/morlin/pymath3d/-/tags
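
Another thing worth checking: the velocities you printed are torch tensors, and urx/math3d may choke on those. Below is a minimal, untested sketch (to_floats is a hypothetical helper, not part of kovis or urx) that converts them to plain floats before they reach speedl_tool:

import torch

def to_floats(vec):
    # Hypothetical helper: turn a mixed list of torch tensors and numbers
    # into plain Python floats before handing it to urx's speedl_tool().
    return [float(v.item()) if torch.is_tensor(v) else float(v) for v in vec]

# e.g. in _watch_servo():
# self.rob.speedl_tool(to_floats(vec), ACCE, t)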


HRItdy commented May 7, 2024

Thanks for your reply.

I tried all the other versions of math3d and kept getting this issue:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/lab_cheem/catkin_ws/src/WP2/kovis1/inference_oja.py", line 161, in servo
    self.move_tcp_perpendicular()
  File "/home/lab_cheem/catkin_ws/src/WP2/kovis1/inference_oja.py", line 95, in move_tcp_perpendicular
    tcp_pose = self.rob.getl()
  File "/home/lab_cheem/.local/lib/python3.8/site-packages/urx/robot.py", line 220, in getl
    return t.pose_vector.get_array().tolist()
AttributeError: 'numpy.ndarray' object has no attribute 'get_array'

It seems to be a numpy issue? I reverted the numpy version but still get this. Do you remember the configuration you used before? Thanks!


HRItdy commented May 7, 2024

Never mind Enyen, I found that the original urx code differs from this one:
return t.pose_vector.tolist()

I will reinstall urx. Thanks!
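
For anyone hitting the same thing, here is a small compatibility sketch (getl_compat is hypothetical, not part of urx) that should work whichever math3d behaviour is installed:

import numpy as np

def getl_compat(rob):
    # Hypothetical wrapper around urx's getl(): return the TCP pose as a list
    # whether pose_vector is already a numpy array (newer math3d) or an object
    # exposing get_array() (older math3d).
    pv = rob.get_pose().pose_vector
    return pv.tolist() if isinstance(pv, np.ndarray) else pv.get_array().tolist()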


HRItdy commented May 7, 2024

Hi Enyen,

For now, when I run the servo script, the robot automatically moves to the position shown below:
eff2cd4aa0091a48183af17f0d08801

I changed the position of the object in PyBullet, but the gripper still aims at the ground. Since in the final demo the gripper should be horizontal, should I change the URDF to make the gripper horizontal, or is there an easier way to achieve this? Also, do we need to update the position of the robot in the script, or will it be updated automatically according to the training dataset? Thanks!


enyen commented May 7, 2024

self.move_tcp_perpendicular()

The code calls move_tcp_perpendicular every time before running the servo, so you could modify this line according to your application.
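
For example, an untested sketch of a horizontal variant (move_tcp_horizontal is hypothetical, and the rotation vector is only a placeholder; adjust it to your mounting and tool frame):

import math

def move_tcp_horizontal(rob, acc=0.1, vel=0.1):
    # Keep the current TCP position but rotate the tool so it points
    # horizontally. The rotation vector below is an example only.
    pose = rob.getl()                   # [x, y, z, rx, ry, rz]
    pose[3:] = [0.0, math.pi / 2, 0.0]  # e.g. tool axis along base +X
    rob.movel(pose, acc=acc, vel=vel)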

@PrithvirajRay

Hi Enyen, thanks for the response. However, this is at the inference stage, right? Wouldn't we want to train the robot in that horizontal end position from the training data generation phase itself? Basically, we want the robot to grab the tube horizontally, so during training data generation in the simulator the robot also needs to move in a similar way, right?


HRItdy commented May 8, 2024

Hi Enyen,

I generated 7000 records to train the model, and the result does not seem good enough.
In some test images the distribution of the keypoints looks good (although several keypoints fall onto the background):
0046_0 49_0 69

But in some other images, the distribution is not good, and the object is not completely segmented out:
0041_0 50_0 69
0040_0 00_0 69

And this is the final loss. I guess it's not good enough:
Screenshot from 2024-05-08 20-39-25

Is there any suggestion to improve the performance of the model? Is this because the object is too small?
Thanks!


HRItdy commented May 9, 2024

Hi Enyen,

I have sent out the meeting link through Gmail. Please let me know if you didn't receive it.

By the way, did you by any chance encounter a 'no RealSense device detected' issue? It seems to be due to the driver. Is it okay if we reinstall the driver directly? Will this break any features you set up before? Thanks!


enyen commented May 9, 2024

Hi Enyen, thanks for the response. However, this is at the inference stage, right? Wouldn't we want to train the robot in that horizontal end position from the training data generation phase itself? Basically, we want the robot to grab the tube horizontally, so during training data generation in the simulator the robot also needs to move in a similar way, right?

For data generation, you have to design your task in the data cfg yaml according to your application.
For inference, place the gripper accordingly, i.e. horizontally in your case.

Hi Enyen,

I generated 7000 records to train the model, and the result does not seem good enough. In some test images the distribution of the keypoints looks good (although several keypoints fall onto the background): 0046_0 49_0 69

But in some other images, the distribution is not good, and the object is not completely segmented out: 0041_0 50_0 69 0040_0 00_0 69

And this is the final loss. I guess it's not good enough: Screenshot from 2024-05-08 20-39-25

Is there any suggestion to improve the performance of the model? Is this because the object is too small? Thanks!

If the training data is generated correctly, try tuning the hyperparams in the training cfg yaml.
Maybe use a plain background for debugging.

Hi Enyen,

I have sent out the meeting link through Gmail. Please let me know if you didn't receive it.

By the way, did you by any chance encounter a 'no RealSense device detected' issue? It seems to be due to the driver. Is it okay if we reinstall the driver directly? Will this break any features you set up before? Thanks!

Mostly connection issues for me.
But reinstalling the driver should be fine.


HRItdy commented May 10, 2024

Hi Enyen,

Thanks a lot for your help! I moved the object closer to the arm, but after doing this the base is included in the view. Does this matter?
image

And in this setting the arm pose also seems a little extreme. Should I move the object further away? That may cause it to fall on the edge of the base; does this matter? Really appreciate your help!

Regards,
Daiying


HRItdy commented May 10, 2024

Moving it further away results in this:

MicrosoftTeams-image (36)

The object will fall on the edge of the base. Does this matter? Thanks!


enyen commented May 10, 2024

Use cam_clip to limit the camera range and avoid seeing the robot base:

cam_clip: [0.03, 0.50]


HRItdy commented May 11, 2024

Thanks for your reply, Enyen!
Based on your suggestion, I tried pasting the image onto a plain background:
Screenshot from 2024-05-11 20-10-38

This is the saved training image:
train

And this is the loss; it still seems very high. I will adjust the hyperparameters to reduce it:
Screenshot from 2024-05-11 20-36-51

Is there any obvious issue in this procedure now? In addition, there is now random noise on the image. Should I also disable the noise for now?
0005_0 00_0 69

Thanks!


HRItdy commented May 12, 2024

Hi Enyen,

I reduced the number of keypoints to 6 and adjusted concentrate to -30 and inside to 60. Now there are 2 keypoints falling onto the object and moving with it:
0330_0 09_0 66
0331_0 00_0 66
0332_0 02_0 70
0333_0 03_0 68
0343_0 09_0 68
0344_0 15_0 68
0345_0 10_0 67
0346_0 00_0 67
But the loss still does not seem good enough:
Screenshot from 2024-05-12 16-24-42

I would like to ask what a reasonable range for the loss is. Should I add the background back now and retrain the model, or continue adjusting the parameters to reduce the loss? Thanks!


enyen commented May 12, 2024

The speed loss does not look good.
Is the training data generated correctly? You can check by verifying that the poses in the first step of all episodes are correct.

Yes, you should continue trying different settings, and also test on the robot.
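
As a quick check, a hedged sketch (the data directory and file naming below are assumptions based on the filenames posted in this thread; adjust the glob pattern to your layout) for eyeballing the first frame of each episode:

import glob
import matplotlib.pyplot as plt

# Assumes frames are named <episode>_<step>..., as the posted filenames suggest.
first_frames = sorted(glob.glob('data/*_00*.png'))
for path in first_frames[:20]:   # inspect the first 20 episodes
    plt.imshow(plt.imread(path))
    plt.title(path)
    plt.show()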


HRItdy commented May 12, 2024

Thanks for your reply Enyen!

I sampled several initial steps:
00001_00
00002_00
00000_00

They have similar relative positions (but with some random orientation). Is this correct?

And I noticed that within one episode the camera seems to have some translation across different steps:

00002_00
00002_01
00002_02
00002_03
00002_04

Should the camera be fixed in one episode? Is this the reason why the speed loss is not good enough?

Thanks!


enyen commented May 13, 2024

Yes, the camera pose is randomized for every image.
You can turn it off at:

cam_rot_rag: [0.1, 0.1, 0.1]
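
Presumably, setting it to zeros disables the camera randomization (worth double-checking against the data generation code):

cam_rot_rag: [0.0, 0.0, 0.0]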


HRItdy commented May 15, 2024

Thanks Enyen!

I increased the dataset size and adjusted the parameters; the result is shown below:
Screenshot from 2024-05-15 12-16-50

Is this loss good enough? Should the ideal speed loss be smaller than 0.1? Thanks!


HRItdy commented May 15, 2024

Hi Enyen, please forget the question above. I tested this model on the real robot and the performance is good (although it performs poorly from some specific initial positions; I will keep adjusting the hyperparameters to improve this). Thanks a lot for your help!


HRItdy commented Jun 24, 2024

Hi Enyen,

Sorry to bother you again. I'm trying to understand the code and to visualize the keypoints in the inference phase, and I found what seems to be an inconsistency between the inference and training phases.

During training, inference is set to False:

def __init__(self, num_input, num_keypoint, growth_rate, block_cfg, drop_rate, kper, inference=False):

So the encoder takes a single image as input:
keypL0 = enc(inL0)

But in the inference phase, the inference parameter is set to True, so it requires images from the left and right cameras simultaneously:

kper, True).cuda()

When I change it to False, the output keypoints have one dimension that is all zeros:
image
If I keep inference as True and input the right and left images simultaneously, like:

kp = getattr(self, 'kp_{}'.format(action))([infraL.cuda(), infraR.cuda()])

the output keypoints have only one dimension:
image

I would like to ask whether inference should be True. If so, how do I use this one-dimensional vector to visualize the keypoints? Thanks, and sorry for the bother!


enyen commented Jun 24, 2024

When inference=True, keypoints from left and right images are concatenated:

return torch.cat((kpl[0], kpr[0]), dim=1)

You can separate the enc and cvt in init_network():

- setattr(self, 'net_servo_{}'.format(net), torch.nn.Sequential(enc, cvt))
+ setattr(self, 'net_enc_{}'.format(net), enc)
+ setattr(self, 'net_cvt_{}'.format(net), cvt)

- getattr(self, 'net_servo_{}'.format(net)).eval()
+ getattr(self, 'net_enc_{}'.format(net)).eval()
+ getattr(self, 'net_cvt_{}'.format(net)).eval()

In _watch_servo():

- vec, speed = getattr(self, 'net_servo_{}'.format(action))([infraL.cuda(), infraR.cuda()])
+ kps = getattr(self, 'net_enc_{}'.format(action))([infraL.cuda(), infraR.cuda()])
+ vec, speed = getattr(self, 'net_cvt_{}'.format(action))(kps)
+ kpl, kpr = kps.squeeze().chunk(chunks=2, dim=1)
+ vis_kp(kpl, kpr)  # todo

The keypoints are flattened:

cmag], dim=1).view(n, 3 * self.c), \

into [h1, w1, m1, ..., hK, wK, mK]
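
For the vis_kp left as a todo above, a minimal sketch (assuming the flattened layout [h1, w1, m1, ..., hK, wK, mK] with h/w normalised to [-1, 1]; rescale if your coordinates are in pixels):

import matplotlib.pyplot as plt

def vis_kp(kpl, kpr):
    # Scatter the keypoints of the left and right images side by side,
    # coloured by their magnitude m.
    fig, axes = plt.subplots(1, 2, figsize=(8, 4))
    for ax, kp, title in zip(axes, (kpl, kpr), ('left', 'right')):
        pts = kp.detach().cpu().reshape(-1, 3).numpy()  # (K, 3): h, w, m
        ax.scatter(pts[:, 1], pts[:, 0], c=pts[:, 2], cmap='viridis')
        ax.set_xlim(-1, 1)
        ax.set_ylim(1, -1)   # image-style vertical axis
        ax.set_title(title)
    plt.show()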


HRItdy commented Jun 25, 2024

Thanks a lot for your reply! Yes, the output is what I got after separating the enc and cvt. Sorry, I have one more question about the output: what do h, w, and m represent? In the paper the output of the keypointer is a 32×32 feature map with K channels, but in the code the output is simplified to the keypoint coordinates hi, wi. Do I understand correctly? Thanks!


enyen commented Jul 2, 2024

h and w make up a point along the first and second axes of the 2D feature map; m is the magnitude or intensity of the point.

def forward(self, x):
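
For intuition, here is a generic spatial soft-argmax sketch of how a K-channel feature map can be reduced to (h, w, m) triplets; the repo's actual forward() may differ in its details:

import torch
import torch.nn.functional as F

def soft_keypoints(fmap):
    # fmap: (N, K, H, W) feature map -> (N, K, 3) of (h, w, m), where h/w are
    # the expected coordinates in [-1, 1] under a spatial softmax and m is a
    # crude magnitude estimate.
    n, k, h, w = fmap.shape
    prob = F.softmax(fmap.reshape(n, k, -1), dim=-1).reshape(n, k, h, w)
    ys = torch.linspace(-1, 1, h, device=fmap.device)
    xs = torch.linspace(-1, 1, w, device=fmap.device)
    ch = (prob.sum(dim=3) * ys).sum(dim=2)   # expected row coordinate
    cw = (prob.sum(dim=2) * xs).sum(dim=2)   # expected column coordinate
    cmag = fmap.amax(dim=(2, 3))             # peak response as magnitude
    return torch.stack([ch, cw, cmag], dim=2)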
