
Official script to convert your dataset to the kitti dataset format #107

Open

sarimmehdi opened this issue Mar 23, 2020 · 6 comments
@sarimmehdi

Hello. I really like your dataset, and I was hoping you could provide a script to convert your data to the KITTI format. For the moment, I found this, which partially helps: https://github.com/Yao-Shao/Waymo_Kitti_Adapter

But the issue is that because you use different reference axes, in both the global and vehicle frames, the intrinsic matrix comes out differently. In KITTI, the projection matrix looks like:

[f_u, 0,   c_u, 0;
 0,   f_v, c_v, 0;
 0,   0,   1,   0]
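For reference, that pinhole matrix can be built directly from the first four entries of Waymo's per-camera intrinsic list (f_u, f_v, c_u, c_v); a minimal numpy sketch with illustrative values, ignoring rectification and stereo baseline:

```python
import numpy as np

# First four Waymo intrinsic entries: f_u, f_v, c_u, c_v (values are illustrative)
f_u, f_v, c_u, c_v = 2083.09, 2083.09, 957.29, 650.57

# KITTI-style 3x4 projection matrix (no rectification / baseline offset here)
P = np.array([
    [f_u, 0.0, c_u, 0.0],
    [0.0, f_v, c_v, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])
```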

But in the code I am using, the author apparently has to shift the columns, so the resulting matrix looks quite different. Here is the piece of code that does this (the save_calib method from the adapter.py script in the repo above):

    # Excerpt from adapter.py; assumes `import numpy as np` and the
    # CALIB_PATH / INDEX_LENGTH constants are defined at module level.
    def save_calib(self, frame, frame_num):
        """ parse and save the calibration data
                :param frame: open dataset frame proto
                :param frame_num: the current frame number
                :return:
        """
        fp_calib = open(CALIB_PATH + '/' + str(frame_num).zfill(INDEX_LENGTH) + '.txt', 'w+')
        # Axis swap from Waymo camera axes (+x forward, +y left, +z up)
        # to standard camera axes (+x right, +y down, +z forward).
        waymo_cam_RT = np.array([0, -1, 0, 0,
                                 0, 0, -1, 0,
                                 1, 0, 0, 0,
                                 0, 0, 0, 1]).reshape(4, 4)
        print("WAYMO_CAM_RT:")
        print(waymo_cam_RT)
        camera_calib = []
        R0_rect = ["%e" % i for i in np.eye(3).flatten()]
        Tr_velo_to_cam = []
        calib_context = ''

        # Vehicle->camera transforms: invert the stored camera->vehicle extrinsics.
        for camera in frame.context.camera_calibrations:
            tmp = np.array(camera.extrinsic.transform).reshape(4, 4)
            tmp = np.linalg.inv(tmp).reshape((16,))
            Tr_velo_to_cam.append(["%e" % i for i in tmp])

        # Intrinsics: [f_u, f_v, c_u, c_v, ...] packed into a 3x4 pinhole matrix.
        for cam in frame.context.camera_calibrations:
            tmp = np.zeros((3, 4))
            tmp[0, 0] = cam.intrinsic[0]
            tmp[1, 1] = cam.intrinsic[1]
            tmp[0, 2] = cam.intrinsic[2]
            tmp[1, 2] = cam.intrinsic[3]
            tmp[2, 2] = 1
            print("BEFORE MULTIPLYING WITH WAYMO_CAM_RT:")
            print(tmp)
            tmp = tmp @ waymo_cam_RT
            print("AFTER MULTIPLYING WITH WAYMO_CAM_RT:")
            print(tmp)
            tmp = ["%e" % i for i in tmp.reshape(12)]
            camera_calib.append(tmp)

        for i in range(5):
            calib_context += "P" + str(i) + ": " + " ".join(camera_calib[i]) + '\n'
        calib_context += "R0_rect" + ": " + " ".join(R0_rect) + '\n'
        for i in range(5):
            calib_context += "Tr_velo_to_cam_" + str(i) + ": " + " ".join(Tr_velo_to_cam[i]) + '\n'
        fp_calib.write(calib_context)
        fp_calib.close()

Here is an example output of the above code to illustrate the issue:

WAYMO_CAM_RT:
[[ 0 -1  0  0]
 [ 0  0 -1  0]
 [ 1  0  0  0]
 [ 0  0  0  1]]
BEFORE MULTIPLYING WITH WAYMO_CAM_RT:
[[2.08309121e+03 0.00000000e+00 9.57293829e+02 0.00000000e+00]
 [0.00000000e+00 2.08309121e+03 6.50569793e+02 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00 0.00000000e+00]]
AFTER MULTIPLYING WITH WAYMO_CAM_RT:
[[ 9.57293829e+02 -2.08309121e+03  0.00000000e+00  0.00000000e+00]
 [ 6.50569793e+02  0.00000000e+00 -2.08309121e+03  0.00000000e+00]
 [ 1.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00]]

As you can see, the third column becomes the first, the first becomes the second, and the second becomes the third (with a minus sign added to the two columns that each had a single nonzero entry). I used this approach, but apparently it does not even give the right 3D bounding boxes, as can be seen here:
Yao-Shao/Waymo_Kitti_Adapter#3
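To confirm, the column shuffle is exactly what right-multiplying by that permutation matrix does; a small numpy check using the values from the printout above:

```python
import numpy as np

# Values from the example output above
f, c_u, c_v = 2083.09121, 957.293829, 650.569793
K = np.array([[f, 0, c_u, 0],
              [0, f, c_v, 0],
              [0, 0, 1.0, 0]])
waymo_cam_RT = np.array([0, -1, 0, 0,
                         0, 0, -1, 0,
                         1, 0, 0, 0,
                         0, 0, 0, 1], dtype=float).reshape(4, 4)

P = K @ waymo_cam_RT
# Column 2 of K (the principal point) lands in column 0, column 0 moves
# to column 1 with a sign flip, and column 1 moves to column 2 negated.
```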

I was hoping you could shed some light on this. The camera calibration matrix needs to have the same format as in KITTI; this apparently does not achieve that, and I cannot figure out why.

@peisun1115
Contributor

I have not yet read Yao-Shao's code.

You can also try our LiDAR->Camera projection lib. https://github.com/waymo-research/waymo-open-dataset/blob/master/third_party/camera/ops/camera_model_ops_test.py

You can find examples of camera projection in this lib as well.

Hopefully this helps.

@sarimmehdi
Author

Hi. Thank you for your reply. I looked at your code, but I cannot see how you construct the intrinsic and extrinsic matrices. In your code, you take the extrinsic and intrinsic parameters as flat list inputs:

image_points_t = py_camera_model_ops.world_to_image(
    extrinsic, intrinsic, metadata, camera_image_metadata, global_points)

I decided to follow your code to its C++ implementation (camera_model_ops.cc), but even there I can't see where you create the matrices. Can you please help me understand how you do it? Converting your data to KITTI would be really useful, as many codebases for 3D object detection and depth estimation use the KITTI format.
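For reference, here is how I understand the flat proto fields map to matrices, assuming extrinsic.transform is a row-major 4x4 camera-to-vehicle transform and intrinsic starts with [f_u, f_v, c_u, c_v] followed by distortion terms; a numpy sketch with made-up values:

```python
import numpy as np

# Illustrative flat lists in the shape the Waymo proto provides
# (extrinsic.transform: row-major 4x4 camera->vehicle transform;
#  intrinsic: [f_u, f_v, c_u, c_v, k1, k2, p1, p2, k3]).
extrinsic_flat = [1.0, 0.0, 0.0, 1.5,
                  0.0, 1.0, 0.0, 0.0,
                  0.0, 0.0, 1.0, 2.0,
                  0.0, 0.0, 0.0, 1.0]
intrinsic_flat = [2083.09, 2083.09, 957.29, 650.57, 0.0, 0.0, 0.0, 0.0, 0.0]

T_cam_to_vehicle = np.array(extrinsic_flat).reshape(4, 4)
f_u, f_v, c_u, c_v = intrinsic_flat[:4]
K = np.array([[f_u, 0.0, c_u],
              [0.0, f_v, c_v],
              [0.0, 0.0, 1.0]])

# Vehicle->camera is the inverse of the stored extrinsic.
T_vehicle_to_cam = np.linalg.inv(T_cam_to_vehicle)
```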

@peisun1115
Contributor

I agree with you that a kitti converter would be useful. You may try one from Yao-Shao for now. Hopefully he can help you to resolve the issue in his code.

Our projection code is here: https://github.com/waymo-research/waymo-open-dataset/blob/master/third_party/camera/camera_model.h

@sarimmehdi
Author

Hi. I looked at your code, and you seem to be storing the intrinsic and extrinsic parameters in the same way as in KITTI. I did that, but it is still not correct, as I am unable to get 3D bounding boxes at the correct position. Do you think one would need to do some special manipulation of the matrices to make them compatible with the KITTI axes?
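One way I tried to sanity-check the "special manipulation" is to verify the axis conventions: Waymo cameras use +x forward, +y left, +z up, while KITTI/standard cameras use +x right, +y down, +z forward, and the adapter's 3x3 rotation should be exactly the change of basis between them. A small check (my own sketch, not from either codebase):

```python
import numpy as np

# Rotation taking a point expressed in Waymo camera axes
# (+x forward, +y left, +z up) into standard/KITTI camera axes
# (+x right, +y down, +z forward) -- the top-left block of waymo_cam_RT.
R = np.array([[0, -1, 0],
              [0, 0, -1],
              [1, 0, 0]], dtype=float)

forward_waymo = np.array([1.0, 0.0, 0.0])  # +x (forward) in Waymo camera frame
left_waymo = np.array([0.0, 1.0, 0.0])     # +y (left)
up_waymo = np.array([0.0, 0.0, 1.0])       # +z (up)
# Forward should map to +z, left to -x (i.e. minus right), up to -y (minus down).
```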

@peisun1115
Contributor

We are just following the standard definition of camera intrinsics. I think you can first try using our projection code directly (this makes sure you read all the information correctly) and then debug the conversion code.

I posted an example code here. #24 (comment)
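As a debugging aid, the full pinhole projection chain can be sketched as below. This is only a sketch under the assumptions discussed in this thread (extrinsic stored as a camera-to-vehicle transform, Waymo camera axis convention); it ignores lens distortion and rolling shutter, which the official camera model handles:

```python
import numpy as np

def project_vehicle_to_image(point_vehicle, T_cam_to_vehicle, f_u, f_v, c_u, c_v):
    """Project a vehicle-frame 3D point to pixel coordinates.

    Ignores lens distortion and rolling shutter; a debugging sketch only,
    not the official Waymo camera model.
    """
    # Axis swap from Waymo camera axes (+x forward, +y left, +z up)
    # to standard camera axes (+x right, +y down, +z forward).
    axis_swap = np.array([[0, -1, 0, 0],
                          [0, 0, -1, 0],
                          [1, 0, 0, 0],
                          [0, 0, 0, 1]], dtype=float)
    p = np.append(point_vehicle, 1.0)
    p_cam = np.linalg.inv(T_cam_to_vehicle) @ p  # vehicle -> Waymo camera axes
    p_std = axis_swap @ p_cam                    # Waymo -> standard camera axes
    x, y, z = p_std[:3]
    return np.array([f_u * x / z + c_u, f_v * y / z + c_v])
```

With an identity extrinsic, a point 10 m straight ahead of the camera should land on the principal point (c_u, c_v); if it does not, the axis handling is wrong somewhere.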

@caizhongang

You may check out my toolkit, which bridges the differences between the two datasets, with visualization confirming that the tool works properly. It also provides a tool to convert KITTI-format prediction results back to the Waymo format.
