Error rendering depth image #14

Open
alpha571 opened this issue Nov 7, 2024 · 3 comments

Comments

alpha571 commented Nov 7, 2024

Dear tsattler and v-pnk,
When I use the PLY model and camera pose file you shared to render depth images with Open3D, I'm having trouble getting depth images that match yours in scale. I think there might be something off with my camera position calculations or other parts of the process, but since I'm just starting out, I can't figure it out on my own. Could you help me spot where things might be going wrong in the code below? Thanks a lot!

import open3d as o3d
import numpy as np
import re

mesh = o3d.io.read_triangle_mesh("akitchen.ply")
mesh.compute_vertex_normals()

def load_camera_intrinsics(file_path):
    with open(file_path, 'r') as f:
        lines = f.readlines()

    camera_params_line = None
    for line in lines:
        if not line.startswith('#'):
            camera_params_line = line.strip()
            break

    if camera_params_line is None:
        raise ValueError("No camera parameters found in the file.")

    params = camera_params_line.split()
    camera_id = int(params[0])  # CAMERA_ID
    model = params[1]           # MODEL
    width = int(params[2])      # WIDTH
    height = int(params[3])     # HEIGHT
    fx = float(params[4])       # fx
    fy = float(params[5])       # fy
    cx = float(params[6])       # cx
    cy = float(params[7])       # cy

    # print(f"fx: {fx}, fy: {fy}, cx: {cx}, cy: {cy}")

    intrinsic = o3d.camera.PinholeCameraIntrinsic()
    intrinsic.set_intrinsics(width, height, fx, fy, cx, cy)
    return intrinsic, width, height

intrinsic, width, height = load_camera_intrinsics("cameras.txt")

camera_params = o3d.camera.PinholeCameraParameters()
camera_params.intrinsic = intrinsic

file_name = "frame-000357.pose.txt"
frame_number = re.search(r'frame-(\d+)', file_name).group(1)

poses = []
with open(file_name, 'r') as f:
    lines = f.readlines()

for i in range(0, len(lines), 4):
    pose_block = []
    for j in range(4):
        if i + j < len(lines):
            pose_row = np.fromstring(lines[i + j], dtype=float, sep=' ')
            pose_block.append(pose_row)
    if len(pose_block) == 4:
        pose = np.vstack(pose_block)
        poses.append(pose)

vis = o3d.visualization.Visualizer()
vis.create_window(width=width, height=height, visible=True)
vis.add_geometry(mesh)

for i, pose in enumerate(poses):
    extrinsic = pose

    R_T = pose[:3, :3]
    t = pose[:3, 3]

    print(f"Extrinsic matrix for pose {i}: \n{extrinsic}")

    camera_position = pose[:3, 3]

    bounding_box = mesh.get_axis_aligned_bounding_box()
    object_center = bounding_box.get_center()

    distance = np.linalg.norm(camera_position - object_center)
    print("Camera to Object Center Distance:", distance)

    print(f"Camera Position for pose {i}: {camera_position}")

    R = R_T
    lookat = camera_position + R[:, 2]
    up = -R[:, 1]
    front = -R[:, 2]

    ctr = vis.get_view_control()
    ctr.set_lookat(lookat)
    ctr.set_up(up)
    ctr.set_front(front)
    ctr.set_zoom(1)

    image = vis.capture_screen_float_buffer(do_render=True)
    image = np.asarray(image)

    from PIL import Image

    image = (image * 255).astype(np.uint8)
    Image.fromarray(image).save("rendered_image.png")

    depth = vis.capture_depth_float_buffer(do_render=True)

    vis.poll_events()
    vis.update_renderer()

    depth_image = np.asarray(depth)

    depth_image_min = np.min(depth_image)
    depth_image_max = np.max(depth_image)

    if depth_image_max > depth_image_min:
        depth_image = (depth_image - depth_image_min) / (depth_image_max - depth_image_min) * 255
    else:
        depth_image.fill(0)

    depth_image = depth_image.astype(np.uint8)

    depth_image_path = f"depth_{frame_number}_{i:04d}.png"

    depth_array = np.asarray(depth_image)

    depth_height, depth_width = depth_image.shape
    print(f"Depth image resolution: {depth_width}x{depth_height}")

    o3d.io.write_image(depth_image_path, o3d.geometry.Image(depth_image))
    print(f"Saved depth image: {depth_image_path}")

vis.destroy_window()

v-pnk (Collaborator) commented Nov 7, 2024

Hi @alpha571,
At first sight, there are issues with the depth units and quantization.

The following line normalizes your depths into the range between 0 and 255, and you lose the original units of the rendered depths. Our depth maps are stored in the original units of the rendered mesh (e.g., meters).

depth_image = (depth_image - depth_image_min) / (depth_image_max - depth_image_min) * 255

Another issue might be storing the depth maps in uint8. That gives you only 256 different values, which might look OK if you look at the depth map as an image but will look very bad when you try to back-project it into 3D space (create a point cloud). I usually store depth maps in 16-bit floats.

depth_image = depth_image.astype(np.uint8)
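
For example, something along these lines keeps the metric values instead (just a rough sketch of the idea; the .npy file is simply one convenient container for the float16 values, not a requirement):

# Capture the raw depth buffer -- the values are in the units of the mesh (e.g. meters)
depth = np.asarray(vis.capture_depth_float_buffer(do_render=True))
# Store the depth without any rescaling, e.g. as a 16-bit float array
np.save(f"depth_{frame_number}_{i:04d}.npy", depth.astype(np.float16))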

I implemented an Open3D renderer some time ago. You can use that one, or use it to debug your own rendering pipeline. It uses Open3D OffscreenRenderer, so the overall structure is a bit different.
https://github.com/v-pnk/3dv-tools/blob/main/scripts/renderer_o3d.py
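
If you want to try the OffscreenRenderer route in your own script, the rough structure is something like the sketch below (this is not a copy of the linked script; it reuses width, height, intrinsic, mesh and pose from your code above, and render_to_depth_image(z_in_view_space=True) needs a reasonably recent Open3D version):

import open3d as o3d
import numpy as np

renderer = o3d.visualization.rendering.OffscreenRenderer(width, height)
material = o3d.visualization.rendering.MaterialRecord()
material.shader = "defaultUnlit"
renderer.scene.add_geometry("mesh", mesh, material)

# The 4x4 pose is used directly as the extrinsic here -- check whether your
# *.pose.txt stores camera-to-world or world-to-camera and invert if needed.
renderer.setup_camera(intrinsic, pose)

# z_in_view_space=True returns metric depth along the camera z-axis instead of
# normalized depth-buffer values.
depth = np.asarray(renderer.render_to_depth_image(z_in_view_space=True))
np.save("depth_offscreen.npy", depth.astype(np.float16))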

alpha571 (Author) commented Nov 8, 2024

Dear v-pnk,
Thank you very much for your reply. When I tried to use the rendering pipeline you provided, I ran into some problems related to the pycolmap version, so I would like to know which version of pycolmap you used. Thanks again for your help.

v-pnk (Collaborator) commented Nov 8, 2024

pycolmap 0.3.0 or 0.4.0 should work.
