Error rendering depth image #14

Open
alpha571 opened this issue Nov 7, 2024 · 3 comments

Comments

alpha571 commented Nov 7, 2024

Dear tsattler and v-pnk,
When I use the PLY model and camera pose file you shared to render depth images with Open3D, I'm having trouble getting depth images that match yours in scale. I think there might be something off with my camera position calculations or other parts of the process, but since I'm just starting out, I can't figure it out on my own. Could you help me spot where things might be going wrong in the code below? Thanks a lot!

import open3d as o3d
import numpy as np
import re

mesh = o3d.io.read_triangle_mesh("akitchen.ply")
mesh.compute_vertex_normals()

def load_camera_intrinsics(file_path):
    with open(file_path, 'r') as f:
        lines = f.readlines()

    camera_params_line = None
    for line in lines:
        if not line.startswith('#'):
            camera_params_line = line.strip()
            break

    if camera_params_line is None:
        raise ValueError("No camera parameters found in the file.")

    params = camera_params_line.split()
    camera_id = int(params[0])  # CAMERA_ID
    model = params[1]           # MODEL
    width = int(params[2])      # WIDTH
    height = int(params[3])     # HEIGHT
    fx = float(params[4])       # fx
    fy = float(params[5])       # fy
    cx = float(params[6])       # cx
    cy = float(params[7])       # cy

    # print(f"fx: {fx}, fy: {fy}, cx: {cx}, cy: {cy}")

    intrinsic = o3d.camera.PinholeCameraIntrinsic()
    intrinsic.set_intrinsics(width, height, fx, fy, cx, cy)
    return intrinsic, width, height

intrinsic, width, height = load_camera_intrinsics("cameras.txt")

camera_params = o3d.camera.PinholeCameraParameters()
camera_params.intrinsic = intrinsic

file_name = "frame-000357.pose.txt"
frame_number = re.search(r'frame-(\d+)', file_name).group(1)

poses = []
with open(file_name, 'r') as f:
    lines = f.readlines()

for i in range(0, len(lines), 4):
    pose_block = []
    for j in range(4):
        if i + j < len(lines):
            pose_row = np.fromstring(lines[i + j], dtype=float, sep=' ')
            pose_block.append(pose_row)
    if len(pose_block) == 4:
        pose = np.vstack(pose_block)
        poses.append(pose)

vis = o3d.visualization.Visualizer()
vis.create_window(width=width, height=height, visible=True)
vis.add_geometry(mesh)

for i, pose in enumerate(poses):
    extrinsic = pose

    R_T = pose[:3, :3]
    t = pose[:3, 3]

    print(f"Extrinsic matrix for pose {i}: \n{extrinsic}")

    camera_position = pose[:3, 3]

    bounding_box = mesh.get_axis_aligned_bounding_box()
    object_center = bounding_box.get_center()

    distance = np.linalg.norm(camera_position - object_center)
    print("Camera to Object Center Distance:", distance)

    print(f"Camera Position for pose {i}: {camera_position}")

    R = R_T
    lookat = camera_position + R[:, 2]
    up = -R[:, 1]
    front = -R[:, 2]

    ctr = vis.get_view_control()
    ctr.set_lookat(lookat)
    ctr.set_up(up)
    ctr.set_front(front)
    ctr.set_zoom(1)

    image = vis.capture_screen_float_buffer(do_render=True)
    image = np.asarray(image)

    from PIL import Image

    image = (image * 255).astype(np.uint8)
    Image.fromarray(image).save("rendered_image.png")

    depth = vis.capture_depth_float_buffer(do_render=True)

    vis.poll_events()
    vis.update_renderer()

    depth_image = np.asarray(depth)

    depth_image_min = np.min(depth_image)
    depth_image_max = np.max(depth_image)

    if depth_image_max > depth_image_min:
        depth_image = (depth_image - depth_image_min) / (depth_image_max - depth_image_min) * 255
    else:
        depth_image.fill(0)

    depth_image = depth_image.astype(np.uint8)

    depth_image_path = f"depth_{frame_number}_{i:04d}.png"

    depth_array = np.asarray(depth_image)

    depth_height, depth_width = depth_image.shape
    print(f"Depth image resolution: {depth_width}x{depth_height}")

    o3d.io.write_image(depth_image_path, o3d.geometry.Image(depth_image))
    print(f"Saved depth image: {depth_image_path}")

vis.destroy_window()

v-pnk (Collaborator) commented Nov 7, 2024

Hi @alpha571,
At first sight, there are issues with the depth units and quantization.

The following line normalizes your depths into the range between 0 and 255, and you lose the original units of the rendered depths. Our depth maps are stored in the original units of the rendered mesh (e.g., meters).

depth_image = (depth_image - depth_image_min) / (depth_image_max - depth_image_min) * 255

Another issue might be storing the depth maps in uint8. That gives you only 256 different values, which might look OK if you look at the depth map as an image but will look very bad when you try to back-project it into 3D space (create a point cloud). I usually store depth maps in 16-bit floats.

depth_image = depth_image.astype(np.uint8)
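
For example, something along these lines keeps the metric values instead (just a rough sketch of the idea; the .npy file is simply one convenient container for the float16 values, not a requirement):

# Capture the raw depth buffer -- the values are in the units of the mesh (e.g. meters)
depth = np.asarray(vis.capture_depth_float_buffer(do_render=True))
# Store the depth without any rescaling, e.g. as a 16-bit float array
np.save(f"depth_{frame_number}_{i:04d}.npy", depth.astype(np.float16))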

I implemented an Open3D renderer some time ago. You can use that one, or use it to debug your own rendering pipeline. It uses Open3D OffscreenRenderer, so the overall structure is a bit different.
https://github.com/v-pnk/3dv-tools/blob/main/scripts/renderer_o3d.py
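
If you want to try the OffscreenRenderer route in your own script, the rough structure is something like the sketch below (this is not a copy of the linked script; it reuses width, height, intrinsic, mesh and pose from your code above, and render_to_depth_image(z_in_view_space=True) needs a reasonably recent Open3D version):

import open3d as o3d
import numpy as np

renderer = o3d.visualization.rendering.OffscreenRenderer(width, height)
material = o3d.visualization.rendering.MaterialRecord()
material.shader = "defaultUnlit"
renderer.scene.add_geometry("mesh", mesh, material)

# The 4x4 pose is used directly as the extrinsic here -- check whether your
# *.pose.txt stores camera-to-world or world-to-camera and invert if needed.
renderer.setup_camera(intrinsic, pose)

# z_in_view_space=True returns metric depth along the camera z-axis instead of
# normalized depth-buffer values.
depth = np.asarray(renderer.render_to_depth_image(z_in_view_space=True))
np.save("depth_offscreen.npy", depth.astype(np.float16))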

alpha571 (Author) commented Nov 8, 2024

Dear v-pnk,
Thank you very much for your reply. When I tried to use the rendering pipeline you provided, I ran into some problems related to the pycolmap version, so I would like to know which version of pycolmap you used. Thanks again for your help.

v-pnk (Collaborator) commented Nov 8, 2024

pycolmap 0.3.0 or 0.4.0 should work.
