This repository contains an implementation of the Kinect Fusion algorithm. Real-time image capture is not included; instead, a sequence of pre-captured images is used as input to the system. While the program runs, the system estimates each frame's camera pose and predicts 3D surface points from the signed distance values using ray casting. As more and more frames are processed, the predicted 3D points are gradually aggregated in the global coordinate system, and an indoor environment can then be reconstructed.
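Under the hood, pipelines of this kind fuse each depth frame into a truncated signed distance function (TSDF) volume as a per-voxel running weighted average. For reference, the standard update from the original KinectFusion paper (Newcombe et al., 2011) is shown below; that this repository follows it exactly is an assumption:

$$
D_k(\mathbf{v}) = \frac{W_{k-1}(\mathbf{v})\,D_{k-1}(\mathbf{v}) + w_k(\mathbf{v})\,d_k(\mathbf{v})}{W_{k-1}(\mathbf{v}) + w_k(\mathbf{v})}, \qquad W_k(\mathbf{v}) = W_{k-1}(\mathbf{v}) + w_k(\mathbf{v})
$$

where $d_k(\mathbf{v})$ is the truncated signed distance of voxel $\mathbf{v}$ measured against frame $k$'s depth image, $w_k(\mathbf{v})$ its weight, and $D_k$, $W_k$ the fused distance and accumulated weight.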
Dependencies:

- cmake >= 2.8
- make >= 4.1
  - Linux: make is installed by default on most Linux distros
- OpenCV >= 4.1
  - The OpenCV 4.1.0 source code can be found here
- gcc/g++ >= 5.4
  - Linux: gcc/g++ is installed by default on most Linux distros
- CUDA >= 11.0
  - Download CUDA here
- PCL >= 1.7
  - Download PCL here
- Clone this repo:

  ```bash
  git clone https://github.com/PoChang007/Kinect_Fusion_SLAM.git
  cd Kinect_Fusion_SLAM
  mkdir build && cd build
  ```

- Compile:

  ```bash
  cmake .. && make -j4
  ```

- Run it:

  ```bash
  ./kinfu_imp
  ```
File structure:

- `src/` contains all `*.cpp` and `*.cu` files
- `data/` contains the RGB-D images from a scan of a static background
Class Name | Description |
---|---|
KinfuPipeline | Initializes the camera intrinsic parameters and the voxel grid for the TSDF. Parallel processing tasks for each frame are launched from here |
SystemUtility | Handles image processing and storage of raw 3D data and predicted 3D points |
ThreeDViewer | Visualizes 3D point clouds. Each frame's predicted 3D points are aligned in the global coordinate system |
`system_interface.cpp` is used to start the Kinect Fusion system. The aggregated 3D point clouds are rendered in the 3D viewer. More information can be found in Usage.
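For orientation, here is a minimal sketch of what such an entry point might look like. `KinfuPipeline` is the real class name from the table above, but the header name, constructor, `ProcessFrame` method, and frame range are illustrative assumptions, not the repo's actual API:

```cpp
// Hypothetical entry-point sketch; header and method names are assumptions.
#include <memory>
#include "kinfu_pipeline.h" // assumed header for Kinfu::KinfuPipeline

int main()
{
    // Camera intrinsics and the TSDF voxel grid are initialized inside
    // KinfuPipeline (see the class table above).
    auto pipeline = std::make_unique<Kinfu::KinfuPipeline>();

    const int start_frame = 1;  // data/ holds the pre-captured RGB-D sequence
    const int end_frame = 100;  // assumed length of that sequence

    for (int frame = start_frame; frame <= end_frame; ++frame)
    {
        // The per-frame parallel tasks (pose estimation, TSDF update,
        // ray casting) are launched from inside the pipeline.
        pipeline->ProcessFrame(frame); // assumed method name
    }
    return 0;
}
```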
CUDA file | Description |
---|---|
projective_tsdf | Updates each voxel's TSDF value based on the voxel's projection onto the current frame's depth image |
voxel_traversal | Marches each ray through the voxel grid looking for zero-crossing regions; where one is found, the surface points can be predicted |
raw_vertices | Converts each pixel in the depth image to a 3D point |
sensor_pose_estimation | Finds eligible point pairs between the current and previous frames and uses ICP to refine the current frame's camera pose |
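As an illustration of the first entry, below is a simplified projective TSDF update kernel. The parameter names, memory layout, weight cap, and truncation handling are assumptions for illustration, not the contents of the repo's `projective_tsdf` file:

```cuda
// Simplified sketch of a projective TSDF update (not the repo's actual
// kernel). One thread per voxel: project the voxel center into the current
// depth image, compute the signed distance along the ray, truncate it, and
// fold it into the voxel's running weighted average.
__global__ void ProjectiveTsdfSketch(float* tsdf, float* weight,
                                     const float* depth,        // depth image in meters
                                     int width, int height,
                                     float fx, float fy, float cx, float cy,
                                     const float* world_to_cam, // assumed 4x4 row-major pose
                                     int dim_x, int dim_y, int dim_z,
                                     float voxel_size, float trunc_margin)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= dim_x * dim_y * dim_z) return;

    // Voxel center in world coordinates (assumed layout: x varies fastest)
    float wx = ((idx % dim_x) + 0.5f) * voxel_size;
    float wy = (((idx / dim_x) % dim_y) + 0.5f) * voxel_size;
    float wz = ((idx / (dim_x * dim_y)) + 0.5f) * voxel_size;

    // Transform the voxel center into the current camera frame
    float px = world_to_cam[0]*wx + world_to_cam[1]*wy + world_to_cam[2] *wz + world_to_cam[3];
    float py = world_to_cam[4]*wx + world_to_cam[5]*wy + world_to_cam[6] *wz + world_to_cam[7];
    float pz = world_to_cam[8]*wx + world_to_cam[9]*wy + world_to_cam[10]*wz + world_to_cam[11];
    if (pz <= 0.f) return; // voxel is behind the camera

    // Project onto the depth image with the pinhole intrinsics
    int u = __float2int_rn(fx * px / pz + cx);
    int v = __float2int_rn(fy * py / pz + cy);
    if (u < 0 || u >= width || v < 0 || v >= height) return;

    float d = depth[v * width + u];
    if (d <= 0.f) return; // invalid depth measurement

    // Signed distance along the viewing ray, truncated to [-1, 1]
    float sdf = d - pz;
    if (sdf < -trunc_margin) return; // far behind the observed surface
    float tsdf_new = fminf(1.f, sdf / trunc_margin);

    // Running weighted average, with the weight capped for robustness
    float w_old = weight[idx];
    float w_new = fminf(w_old + 1.f, 128.f);
    tsdf[idx]   = (tsdf[idx] * w_old + tsdf_new) / w_new;
    weight[idx] = w_new;
}
```

A host-side launch would follow the usual CUDA pattern, e.g. `ProjectiveTsdfSketch<<<(num_voxels + 255) / 256, 256>>>(...)` once the depth image and pose have been copied to device memory.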
Once the program is started, the 3D point clouds are rendered in the 3D viewer, and the color and depth images of the frame currently being processed are displayed in separate windows.
In the 3D viewer:

- Press `r` to reset the camera viewpoint
- Press `e` to end the program (key handling of this kind is sketched below)
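These key bindings can be implemented through PCL's visualizer callback mechanism; a minimal, self-contained sketch (independent of the repo's actual `ThreeDViewer` code):

```cpp
// Minimal keyboard handling with PCL's visualizer (illustrative only).
#include <pcl/visualization/pcl_visualizer.h>

static bool quit_requested = false;

void OnKeyboard(const pcl::visualization::KeyboardEvent& event, void* viewer_void)
{
    auto* viewer = static_cast<pcl::visualization::PCLVisualizer*>(viewer_void);
    if (!event.keyDown())
        return;
    if (event.getKeySym() == "r")      // reset camera viewpoint
        viewer->resetCamera();
    else if (event.getKeySym() == "e") // end the program
        quit_requested = true;
}

int main()
{
    pcl::visualization::PCLVisualizer viewer("3D Viewer");
    viewer.registerKeyboardCallback(OnKeyboard, static_cast<void*>(&viewer));

    while (!viewer.wasStopped() && !quit_requested)
        viewer.spinOnce(100); // handle rendering/input for ~100 ms per loop
    return 0;
}
```

Note that `PCLVisualizer` ships with its own default key bindings, so real code may need to account for overlaps with them.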
- In this repo, the RGB-D data was captured by a Kinect v1. Other RGB-D sensors can be used as long as the image size, the camera intrinsics, and the extrinsics between the color and depth cameras are known
- When the current frame's processing is done, its corresponding camera pose is temporarily stored in `cv::Mat Kinfu::KinfuPipeline::extrinsic_matrix` until the processing of the next frame is finished
- Default settings:
  - The size of the voxel grid is 3.6 m × 3.6 m × 3 m
  - The size of each voxel is 0.01 m × 0.01 m × 0.01 m
  - For 3D visualization, point clouds are drawn every 5th frame
- The system was originally implemented in Matlab. The voxel traversal algorithm for rays uses Matlab-style indexing (starting at 1); see the traversal sketch below
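Two of the notes above translate directly into numbers and code: with the default sizes, the grid works out to 360 × 360 × 300 = 38,880,000 voxels (roughly 300 MB if each voxel stores a float TSDF value plus a float weight), and the 1-based Matlab indexing chiefly matters when porting the ray traversal. Below is a sketch of Amanatides & Woo style voxel stepping, a common choice for this kind of traversal; whether the repo uses exactly this variant is not stated, and the sketch is written 0-based with the Matlab offset called out:

```cpp
// Sketch of voxel-grid ray traversal in the Amanatides & Woo style
// (a common approach; the repo's exact variant is not documented here).
// Indices are 0-based; a Matlab-style port adds 1 to every index.
#include <cmath>
#include <cstdio>

int main()
{
    const float voxel = 0.01f;            // voxel edge length (meters)
    const int dims[3] = {360, 360, 300};  // 3.6 m x 3.6 m x 3 m at 0.01 m voxels

    const float origin[3] = {1.805f, 1.805f, 0.005f};   // ray start, inside the grid
    const float dir[3]    = {0.267f, 0.535f, 0.802f};   // normalized ray direction

    int   idx[3], step[3];
    float t_max[3], t_delta[3];
    for (int a = 0; a < 3; ++a)
    {
        idx[a]  = static_cast<int>(origin[a] / voxel); // current voxel (Matlab: +1)
        step[a] = (dir[a] >= 0.f) ? 1 : -1;
        // Ray-parameter distance between successive grid planes on this axis
        t_delta[a] = voxel / std::fabs(dir[a]);
        // Ray parameter at which the next grid plane on this axis is crossed
        float next_plane = (idx[a] + (step[a] > 0 ? 1 : 0)) * voxel;
        t_max[a] = (next_plane - origin[a]) / dir[a];
    }

    for (int i = 0; i < 10; ++i) // walk 10 voxels for demonstration
    {
        std::printf("visiting voxel (%d, %d, %d)\n", idx[0], idx[1], idx[2]);
        // In the real pipeline the TSDF is sampled here; traversal stops
        // where its sign flips (a zero crossing) and the surface point is
        // interpolated between the two samples.

        // Advance along the axis whose next plane crossing is nearest
        int a = (t_max[0] < t_max[1])
                    ? ((t_max[0] < t_max[2]) ? 0 : 2)
                    : ((t_max[1] < t_max[2]) ? 1 : 2);
        idx[a]   += step[a];
        t_max[a] += t_delta[a];
        if (idx[a] < 0 || idx[a] >= dims[a]) break; // ray left the grid
    }
    return 0;
}
```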
The project is released under the MIT License.