Added support for REPLICA datasets. Added viz of traj errors.
luigifreda committed Dec 19, 2024
1 parent 0c8aa8f commit fa5383a
Showing 18 changed files with 617 additions and 215 deletions.
86 changes: 46 additions & 40 deletions README.md
Original file line number Diff line number Diff line change
@@ -5,42 +5,43 @@ Author: **[Luigi Freda](https://www.luigifreda.com)**
<!-- TOC -->

- [pySLAM v2.2.6](#pyslam-v226)
- [Install](#install)
- [Main requirements](#main-requirements)
- [Ubuntu](#ubuntu)
- [MacOS](#macos)
- [Docker](#docker)
- [How to install non-free OpenCV modules](#how-to-install-non-free-opencv-modules)
- [Troubleshooting and performance issues](#troubleshooting-and-performance-issues)
- [Usage](#usage)
- [Feature tracking](#feature-tracking)
- [Loop closing](#loop-closing)
- [Vocabulary management](#vocabulary-management)
- [Vocabulary-free loop closing](#vocabulary-free-loop-closing)
- [Volumetric reconstruction pipeline](#volumetric-reconstruction-pipeline)
- [Depth prediction](#depth-prediction)
- [Save and reload a map](#save-and-reload-a-map)
- [Relocalization in a loaded map](#relocalization-in-a-loaded-map)
- [Trajectory saving](#trajectory-saving)
- [SLAM GUI](#slam-gui)
- [Monitor the logs for tracking, local mapping, and loop closing simultaneously](#monitor-the-logs-for-tracking-local-mapping-and-loop-closing-simultaneously)
- [Supported components and models](#supported-components-and-models)
- [Supported local features](#supported-local-features)
- [Supported matchers](#supported-matchers)
- [Supported global descriptors and local descriptor aggregation methods](#supported-global-descriptors-and-local-descriptor-aggregation-methods)
- [Local descriptor aggregation methods](#local-descriptor-aggregation-methods)
- [Global descriptors](#global-descriptors)
- [Supported depth prediction models](#supported-depth-prediction-models)
- [Datasets](#datasets)
- [KITTI Datasets](#kitti-datasets)
- [TUM Datasets](#tum-datasets)
- [EuRoC Dataset](#euroc-dataset)
- [Camera Settings](#camera-settings)
- [Comparison pySLAM vs ORB-SLAM3](#comparison-pyslam-vs-orb-slam3)
- [Contributing to pySLAM](#contributing-to-pyslam)
- [References](#references)
- [Credits](#credits)
- [TODOs](#todos)
- [1. Install](#1-install)
- [1.1. Main requirements](#11-main-requirements)
- [1.2. Ubuntu](#12-ubuntu)
- [1.3. MacOS](#13-macos)
- [1.4. Docker](#14-docker)
- [1.5. How to install non-free OpenCV modules](#15-how-to-install-non-free-opencv-modules)
- [1.6. Troubleshooting and performance issues](#16-troubleshooting-and-performance-issues)
- [2. Usage](#2-usage)
- [2.1. Feature tracking](#21-feature-tracking)
- [2.2. Loop closing](#22-loop-closing)
- [2.2.1. Vocabulary management](#221-vocabulary-management)
- [2.2.2. Vocabulary-free loop closing](#222-vocabulary-free-loop-closing)
- [2.3. Volumetric reconstruction pipeline](#23-volumetric-reconstruction-pipeline)
- [2.4. Depth prediction](#24-depth-prediction)
- [2.5. Save and reload a map](#25-save-and-reload-a-map)
- [2.6. Relocalization in a loaded map](#26-relocalization-in-a-loaded-map)
- [2.7. Trajectory saving](#27-trajectory-saving)
- [2.8. SLAM GUI](#28-slam-gui)
- [2.9. Monitor the logs for tracking, local mapping, and loop closing simultaneously](#29-monitor-the-logs-for-tracking-local-mapping-and-loop-closing-simultaneously)
- [3. Supported components and models](#3-supported-components-and-models)
- [3.1. Supported local features](#31-supported-local-features)
- [3.2. Supported matchers](#32-supported-matchers)
- [3.3. Supported global descriptors and local descriptor aggregation methods](#33-supported-global-descriptors-and-local-descriptor-aggregation-methods)
- [3.3.1. Local descriptor aggregation methods](#331-local-descriptor-aggregation-methods)
- [3.3.2. Global descriptors](#332-global-descriptors)
- [3.4. Supported depth prediction models](#34-supported-depth-prediction-models)
- [4. Datasets](#4-datasets)
- [4.1. KITTI Datasets](#41-kitti-datasets)
- [4.2. TUM Datasets](#42-tum-datasets)
- [4.3. EuRoC Datasets](#43-euroc-datasets)
- [4.4. Replica Datasets](#44-replica-datasets)
- [5. Camera Settings](#5-camera-settings)
- [6. Comparison pySLAM vs ORB-SLAM3](#6-comparison-pyslam-vs-orb-slam3)
- [7. Contributing to pySLAM](#7-contributing-to-pyslam)
- [8. References](#8-references)
- [9. Credits](#9-credits)
- [10. TODOs](#10-todos)

<!-- /TOC -->

@@ -262,8 +263,7 @@ Some quick information about the non-trivial GUI buttons of `main_slam.py`:
- `Step`: Enter the *step-by-step mode*. Press `Step` once to pause; press it again to make the pipeline process a single new frame.
- `Save`: Save the map into the file `map.json`. You can load and visualize it again with the script `/main_map_viewer.py` (as explained above).
- `Reset`: Reset the SLAM system.
- `Draw Grount Truth`: In the case a groundtruth is loaded (e.g. with *KITTI*, *TUM*, *EUROC* datasets), you can visualize it by pressing this button. The groundtruth trajectory will be visualized and progressively aligned to the estimated trajectory: The more the number of samples in the estimated trajectory the better the computed alignment.

- `Draw Ground Truth`: If a ground truth trajectory is loaded (e.g., from a KITTI, TUM, EUROC, or REPLICA dataset), you can visualize it by pressing this button. The ground truth trajectory is displayed in 3D and progressively aligned (approximately every 30 frames) with the estimated trajectory; the alignment improves as more samples are added to the estimated trajectory. After ~20 frames, pressing the button also opens a window showing the per-axis Cartesian alignment errors between the ground truth and the estimated trajectory.
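
The progressive alignment between the estimated and ground-truth trajectories can be sketched with a standard rigid (Umeyama-style, scale-free) registration. This is an illustrative numpy version, not pySLAM's actual implementation, and the function name is mine:

```python
import numpy as np

def align_trajectories(est, gt):
    """Rigidly align estimated positions est (N,3) to ground truth gt (N,3)
    and return the rotation R, translation t, and per-axis errors."""
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    E, G = est - mu_e, gt - mu_g
    U, _, Vt = np.linalg.svd(E.T @ G)                  # SVD of the cross-covariance
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflections
    R = (U @ S @ Vt).T                                 # maps est into the gt frame
    t = mu_g - R @ mu_e
    errors = (est @ R.T + t) - gt                      # Cartesian errors per axis
    return R, t, errors
```

With more trajectory samples the least-squares fit is better constrained, which is why the displayed alignment improves over time.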

### Monitor the logs for tracking, local mapping, and loop closing simultaneously
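
One lightweight way to follow several log files at once from a single process (an illustrative sketch; actual pySLAM log file names are not assumed here):

```python
import time

def follow(files, poll_s=0.5, max_iters=None):
    """Yield (filename, line) pairs as lines appear in any of the files.
    Reads existing content first, then polls for appended lines."""
    handles = {name: open(name, 'r') for name in files}
    try:
        it = 0
        while max_iters is None or it < max_iters:
            for name, f in handles.items():
                for line in f:  # each file object remembers its read position
                    yield name, line.rstrip('\n')
            it += 1
            time.sleep(poll_s)
    finally:
        for f in handles.values():
            f.close()
```

In practice one can also simply open multiple terminals and `tail -f` each log.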

@@ -438,12 +438,18 @@ $ python associate.py PATH_TO_SEQUENCE/rgb.txt PATH_TO_SEQUENCE/depth.txt > asso
3. Select the corresponding calibration settings file (parameter `TUM_DATASET: cam_settings:` in the file `config.yaml`).


### EuRoC Dataset
### EuRoC Datasets

1. Download a sequence (ASL format) from http://projects.asl.ethz.ch/datasets/doku.php?id=kmavvisualinertialdatasets (check this direct [link](http://robotics.ethz.ch/~asl-datasets/ijrr_euroc_mav_dataset/))
2. Use the script `groundtruth/generate_euroc_groundtruths_as_tum.sh` to generate the TUM-like groundtruth files `path + '/' + name + '/mav0/state_groundtruth_estimate0/data.tum'` that are required by the `EurocGroundTruth` class.
3. Select the corresponding calibration settings file (parameter `EUROC_DATASET: cam_settings:` in the file `config.yaml`).
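
Step 2 relies on EuRoC's CSV layout: each `mav0/cam*/data.csv` row pairs a nanosecond timestamp with an image filename, and the loader converts the timestamps to seconds. A hedged sketch of that parsing (illustrative; not pySLAM's actual code):

```python
import csv

def read_euroc_csv(path):
    """Parse a EuRoC mav0/cam*/data.csv file.
    Returns a list of (timestamp_seconds, filename) tuples."""
    out = []
    with open(path, newline='') as f:
        for row in csv.reader(f):
            if not row or row[0].startswith('#'):   # skip header/comment lines
                continue
            out.append((int(row[0]) * 1e-9, row[1]))  # ns -> s
    return out
```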


### Replica Datasets

1. Download the zip file containing all the sequences with `wget https://cvg-data.inf.ethz.ch/nice-slam/data/Replica.zip`. Then, uncompress it and place the extracted sequence folders wherever you prefer.
2. Select the corresponding calibration settings file (parameter `REPLICA_DATASET: cam_settings:` in the file `config.yaml`).
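
After extraction, each sequence folder (e.g. `room2`) should contain a `results/` directory with `frame*.jpg` color images and `depth*.png` depth images — the layout the new `ReplicaDataset` class globs for. A small sanity-check sketch (illustrative; the function name is mine):

```python
import glob
import os

def check_replica_sequence(base_path):
    """Count color/depth frames in a Replica sequence folder and
    verify the two streams have the same length."""
    colors = sorted(glob.glob(os.path.join(base_path, 'results', 'frame*.jpg')))
    depths = sorted(glob.glob(os.path.join(base_path, 'results', 'depth*.png')))
    if len(colors) != len(depths):
        raise ValueError(f'{len(colors)} color vs {len(depths)} depth frames')
    return len(colors)
```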

---
## Camera Settings

@@ -496,7 +502,7 @@ Suggested material:
* *[Visual Place Recognition: A Tutorial](https://arxiv.org/pdf/2303.03281)*
* *[Bags of Binary Words for Fast Place Recognition in Image Sequences](http://doriangalvez.com/papers/GalvezTRO12.pdf)*

Moreover, you may want to have a look at the OpenCV [guide](https://docs.opencv.org/3.0-beta/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html) or [tutorials](https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_tutorials.html).
Moreover, you may want to have a look at the OpenCV [guide](https://docs.opencv.org/4.x/index.html) or [tutorials](https://docs.opencv.org/4.x/d6/d00/tutorial_py_root.html).

---
## Credits
2 changes: 1 addition & 1 deletion camera.py
@@ -74,7 +74,7 @@ def __init__(self, config):
self.cx = cx
self.cy = cy

self.D = np.array(D,dtype=np.float32) # np.array([k1, k2, p1, p2, k3]) distortion coefficients
self.D = np.array(D,dtype=float) # np.array([k1, k2, p1, p2, k3]) distortion coefficients
self.is_distorted = np.linalg.norm(self.D) > 1e-10

self.fps = fps
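
The distortion vector stored above follows OpenCV's `[k1, k2, p1, p2, k3]` radial-tangential (Brown–Conrady) convention; for reference, applying it to normalized image coordinates looks roughly like this (a sketch of the standard model, not pySLAM code):

```python
def distort_normalized(xn, yn, D):
    """Apply OpenCV-style radial-tangential distortion to normalized
    coordinates (x/z, y/z). D = [k1, k2, p1, p2, k3]."""
    k1, k2, p1, p2, k3 = D
    r2 = xn**2 + yn**2
    radial = 1 + k1*r2 + k2*r2**2 + k3*r2**3          # radial term
    xd = xn*radial + 2*p1*xn*yn + p2*(r2 + 2*xn**2)   # + tangential terms
    yd = yn*radial + p1*(r2 + 2*yn**2) + 2*p2*xn*yn
    return xd, yd
```

When `D` is all zeros the mapping is the identity, which is exactly what the `np.linalg.norm(self.D) > 1e-10` check above exploits to skip undistortion.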
16 changes: 8 additions & 8 deletions config.py
@@ -258,36 +258,36 @@ def stereo_settings(self):
left, right = {}, {}
if 'LEFT.D' in self.cam_settings:
left_D = self.cam_settings['LEFT.D']
left_D = np.array(left_D['data'],dtype=np.float).reshape(left_D['rows'], left_D['cols'])
left_D = np.array(left_D['data'],dtype=float).reshape(left_D['rows'], left_D['cols'])
left['D'] = left_D
if 'LEFT.K' in self.cam_settings:
left_K = self.cam_settings['LEFT.K']
left_K = np.array(left_K['data'],dtype=np.float).reshape(left_K['rows'], left_K['cols'])
left_K = np.array(left_K['data'],dtype=float).reshape(left_K['rows'], left_K['cols'])
left['K'] = left_K
if 'LEFT.R' in self.cam_settings:
left_R = self.cam_settings['LEFT.R']
left_R = np.array(left_R['data'],dtype=np.float).reshape(left_R['rows'], left_R['cols'])
left_R = np.array(left_R['data'],dtype=float).reshape(left_R['rows'], left_R['cols'])
left['R'] = left_R
if 'LEFT.P' in self.cam_settings:
left_P = self.cam_settings['LEFT.P']
left_P = np.array(left_P['data'],dtype=np.float).reshape(left_P['rows'], left_P['cols'])
left_P = np.array(left_P['data'],dtype=float).reshape(left_P['rows'], left_P['cols'])
left['P'] = left_P

if 'RIGHT.D' in self.cam_settings:
right_D = self.cam_settings['RIGHT.D']
right_D = np.array(right_D['data'],dtype=np.float).reshape(right_D['rows'], right_D['cols'])
right_D = np.array(right_D['data'],dtype=float).reshape(right_D['rows'], right_D['cols'])
right['D'] = right_D
if 'RIGHT.K' in self.cam_settings:
right_K = self.cam_settings['RIGHT.K']
right_K = np.array(right_K['data'],dtype=np.float).reshape(right_K['rows'], right_K['cols'])
right_K = np.array(right_K['data'],dtype=float).reshape(right_K['rows'], right_K['cols'])
right['K'] = right_K
if 'RIGHT.R' in self.cam_settings:
right_R = self.cam_settings['RIGHT.R']
right_R = np.array(right_R['data'],dtype=np.float).reshape(right_R['rows'], right_R['cols'])
right_R = np.array(right_R['data'],dtype=float).reshape(right_R['rows'], right_R['cols'])
right['R'] = right_R
if 'RIGHT.P' in self.cam_settings:
right_P = self.cam_settings['RIGHT.P']
right_P = np.array(right_P['data'],dtype=np.float).reshape(right_P['rows'], right_P['cols'])
right_P = np.array(right_P['data'],dtype=float).reshape(right_P['rows'], right_P['cols'])
right['P'] = right_P

if len(left) > 0 and len(right) > 0:
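
The `LEFT.P`/`RIGHT.P` matrices read above are 3×4 rectified projection matrices; in the usual rectified-stereo convention the right matrix encodes the baseline via `P[0,3] = -fx * baseline`. A sketch of extracting intrinsics and baseline under that assumption (illustrative, not pySLAM's actual code):

```python
import numpy as np

def stereo_params_from_P(P_right):
    """Extract fx, fy, cx, cy and the baseline from a rectified
    right-camera 3x4 projection matrix, assuming P[0,3] = -fx * b."""
    P = np.asarray(P_right, dtype=float).reshape(3, 4)
    fx, fy = P[0, 0], P[1, 1]
    cx, cy = P[0, 2], P[1, 2]
    baseline = -P[0, 3] / fx   # b in the same metric units as P[0,3]/fx
    return fx, fy, cx, cy, baseline
```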
12 changes: 11 additions & 1 deletion config.yaml
@@ -49,6 +49,7 @@ DATASET:
#type: EUROC_DATASET
#type: KITTI_DATASET
#type: TUM_DATASET
#type: REPLICA_DATASET
type: VIDEO_DATASET
#type: FOLDER_DATASET
#type: LIVE_DATASET # Not recommended for current development stage
@@ -100,13 +101,22 @@ EUROC_DATASET:
#name: V203
# 'settings' will be used when sensor_type: : 'mono'
settings: settings/EuRoC_mono.yaml
# 'settings_stereo' will be used when sensor_type: : 'stereo' (if available)
# 'settings_stereo' will be used when sensor_type: 'stereo' (if available)
settings_stereo: settings/EuRoC_stereo.yaml
associations: associations.txt
groundtruth_file: auto
start_frame_id: 0


REPLICA_DATASET:
type: replica
sensor_type: rgbd # Here, 'sensor_type' can be 'mono' or 'rgbd'
base_path: /home/luigi/Work/datasets/rgbd_datasets/replica
name: 'room2'
settings: settings/REPLICA.yaml # do not forget to correctly set the corresponding camera settings file
groundtruth_file: auto


VIDEO_DATASET:
type: video
sensor_type: mono # Here, 'sensor_type' can be only 'mono'
64 changes: 60 additions & 4 deletions dataset.py
@@ -39,9 +39,10 @@ class DatasetType(SerializableEnum):
KITTI = 2
TUM = 3
EUROC = 4
VIDEO = 5
FOLDER = 6 # generic folder of pics
LIVE = 7
REPLICA = 5
VIDEO = 6
FOLDER = 7 # generic folder of pics
LIVE = 8


@register_class
@@ -102,6 +103,8 @@ def dataset_factory(config):
dataset = TumDataset(path, name, sensor_type, associations, start_frame_id, DatasetType.TUM)
if type == 'euroc':
dataset = EurocDataset(path, name, sensor_type, associations, start_frame_id, DatasetType.EUROC, config)
if type == 'replica':
dataset = ReplicaDataset(path, name, sensor_type, associations, start_frame_id, DatasetType.REPLICA)
if type == 'video':
dataset = VideoDataset(path, name, sensor_type, associations, timestamps, start_frame_id, DatasetType.VIDEO)
if type == 'folder':
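
The if-chain above implements a simple dataset factory; the same dispatch can be sketched as a type-to-constructor map (illustrative, with hypothetical stub constructors — not pySLAM's classes):

```python
def make_dataset_factory(constructors):
    """constructors: dict mapping a dataset type string to a callable."""
    def factory(dataset_type, path, name):
        try:
            return constructors[dataset_type](path, name)
        except KeyError:
            raise ValueError(f'unknown dataset type: {dataset_type}')
    return factory

# usage with stub constructors standing in for the real dataset classes
factory = make_dataset_factory({
    'replica': lambda p, n: ('ReplicaDataset', p, n),
    'euroc':   lambda p, n: ('EurocDataset', p, n),
})
```

A dict keeps the dispatch table in one place, so adding a dataset (as this commit does for `replica`) touches a single entry rather than a new branch.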
Expand Down Expand Up @@ -594,6 +597,7 @@ def __init__(self, path, name, sensor_type=SensorType.STEREO, associations=None,
self.image_left_csv_path = '/mav0/cam0/data.csv'
self.image_right_csv_path = '/mav0/cam1/data.csv'

# here we read and convert the timestamps to seconds
timestamps_and_filenames_left = self.read_data(self.path + '/' + self.name + self.image_left_csv_path)
timestamps_and_filenames_right = self.read_data(self.path + '/' + self.name + self.image_right_csv_path)

@@ -607,7 +611,7 @@ def __init__(self, path, name, sensor_type=SensorType.STEREO, associations=None,
# in case of stereo mode, we rectify the stereo images
self.stereo_settings = config.stereo_settings
if self.sensor_type == SensorType.STEREO:
Printer.yellow('[EuroDataset] automatically rectifying the stereo images')
Printer.yellow('[EurocDataset] automatically rectifying the stereo images')
if self.stereo_settings is None:
sys.exit('ERROR: we are missing stereo settings in Euroc YAML settings!')
width = config.width
@@ -688,4 +692,56 @@ def getImageRight(self, frame_id):
else:
self._next_timestamp = self._timestamp + self.Ts
self.is_ok = (img is not None)
return img


class ReplicaDataset(Dataset):
fps = 25
Ts = 1./fps
def __init__(self, path, name, sensor_type=SensorType.RGBD, associations=None, start_frame_id=0, type=DatasetType.REPLICA):
super().__init__(path, name, sensor_type, ReplicaDataset.fps, associations, start_frame_id, type)
self.environment_type = DatasetEnvironmentType.INDOOR
if sensor_type != SensorType.MONOCULAR and sensor_type != SensorType.RGBD:
raise ValueError('Replica dataset only supports MONOCULAR and RGBD sensor types')
self.fps = ReplicaDataset.fps
self.Ts = ReplicaDataset.Ts
self.scale_viewer_3d = 0.1
if sensor_type == SensorType.MONOCULAR:
self.scale_viewer_3d = 0.05
print('Processing Replica Sequence')
self.base_path = self.path + '/' + self.name + '/'
# count the number of frames in the path
self.color_paths = sorted(glob.glob(f'{self.base_path}/results/frame*.jpg'))
self.depth_paths = sorted(glob.glob(f'{self.base_path}/results/depth*.png'))
self.max_frame_id = len(self.color_paths)
self.num_frames = self.max_frame_id
print(f'Number of frames: {self.max_frame_id}')

def getImage(self, frame_id):
img = None
if frame_id < self.max_frame_id:
file = self.base_path + f'results/frame{str(frame_id).zfill(6)}.jpg'
img = cv2.imread(file)
self.is_ok = (img is not None)
self._timestamp = frame_id * self.Ts
self._next_timestamp = self._timestamp + self.Ts
else:
self.is_ok = False
self._timestamp = None
return img

def getDepth(self, frame_id):
if self.sensor_type == SensorType.MONOCULAR:
return None # force a monocular camera if required (to get a monocular tracking even if depth is available)
frame_id += self.start_frame_id
img = None
if frame_id < self.max_frame_id:
file = self.base_path + f'results/depth{str(frame_id).zfill(6)}.png'
img = cv2.imread(file, cv2.IMREAD_UNCHANGED)
self.is_ok = (img is not None)
self._timestamp = frame_id * self.Ts
self._next_timestamp = self._timestamp + self.Ts
else:
self.is_ok = False
self._timestamp = None
return img
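
Note that `getDepth` reads the depth PNG with `cv2.IMREAD_UNCHANGED`, which returns the raw 16-bit values; obtaining meters requires dividing by a depth scale. The scale below (`6553.5`) is an assumption taken from NICE-SLAM's Replica configs — verify it against the corresponding settings file:

```python
import numpy as np

def depth_png_to_meters(depth_u16, png_depth_scale=6553.5):
    """Convert a raw uint16 Replica depth image to float32 meters.
    png_depth_scale is an assumption borrowed from NICE-SLAM's configs."""
    depth = depth_u16.astype(np.float32) / png_depth_scale
    depth[depth_u16 == 0] = 0.0   # keep invalid (zero) pixels at zero
    return depth
```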
3 changes: 2 additions & 1 deletion global_bundle_adjustment.py
@@ -70,7 +70,7 @@
logging_file=kLogsFolder + '/gba.log'
local_logger = Logging.setup_file_logger('gba_logger', logging_file, formatter=Logging.simple_log_formatter)
def print(*args, **kwargs):
return local_logger.info(*args, **kwargs)
message = ' '.join(str(arg) for arg in args) # Convert all arguments to strings and join with spaces
return local_logger.info(message, **kwargs)
else:
def print(*args, **kwargs):
return
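
The pattern in this hunk — shadowing `print` at module scope so existing calls are routed to a logger, joining the arguments into one string first — can be demonstrated in isolation (a sketch with an in-memory handler instead of the log file; the names are mine):

```python
import io
import logging

def make_logger_print(name='demo_logger'):
    """Return a print-like function that writes to a logger, plus the
    underlying stream so the output can be inspected."""
    stream = io.StringIO()
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers = [logging.StreamHandler(stream)]
    def print(*args, **kwargs):
        # join all arguments with spaces, as the fixed version in the diff does
        message = ' '.join(str(arg) for arg in args)
        logger.info(message)
    return print, stream

print_fn, stream = make_logger_print()
print_fn('pose', 42)
```

Joining the arguments first matters because `logging.Logger.info` treats extra positional arguments as `%`-style format parameters, unlike the builtin `print`.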