We introduce the Simultaneous Multiple Object detection and Pose Estimation Network (SMOPE-Net), which performs multi-object detection and pose estimation in an end-to-end manner. SMOPE-Net extracts 3D-model features and fuses them with image features to infer object categories, 2D detection boxes, poses, and visibility. We compare against existing methods on multiple datasets, including the new KITTI-3D dataset and the existing LineMod-O dataset, and our method outperforms them on pose estimation on both datasets.
SMOPE-Net on KITTI-3D dataset
Comparison results on KITTI-3D dataset
SMOPE-Net on LineMod-O dataset
Schematics of the end-to-end trainable SMOPE-Net: the network takes images and 3D object models as input.
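For intuition, here is a minimal sketch of the fusion stage this description implies: image features and 3D-model features are concatenated, fused, and decoded into the four outputs. This is our illustration, not the authors' exact architecture; every module, dimension, and head layout below is an assumption.

```python
import torch
import torch.nn as nn

class FusionHeadSketch(nn.Module):
    """Toy stand-in for the fusion stage: concatenate image and 3D-model
    features, fuse them, and decode class, 2D box, pose, and visibility."""
    def __init__(self, img_dim=256, model_dim=256, num_classes=4):
        super().__init__()
        self.fuse = nn.Linear(img_dim + model_dim, 256)
        self.cls_head = nn.Linear(256, num_classes)  # object category logits
        self.box_head = nn.Linear(256, 4)            # 2D box (cx, cy, w, h), normalized
        self.pose_head = nn.Linear(256, 7)           # quaternion (4) + translation (3)
        self.vis_head = nn.Linear(256, 1)            # visibility score in [0, 1]

    def forward(self, img_feat, model_feat):
        x = torch.relu(self.fuse(torch.cat([img_feat, model_feat], dim=-1)))
        return (self.cls_head(x), self.box_head(x).sigmoid(),
                self.pose_head(x), self.vis_head(x).sigmoid())

# 100 object queries, each pairing a 256-d image feature with a 256-d model feature
heads = FusionHeadSketch()
cls_logits, boxes, poses, vis = heads(torch.randn(100, 256), torch.randn(100, 256))
```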
- Linux, CUDA>=10.0, GCC>=5.4
- Python>=3.8

  We recommend using Anaconda to create a conda environment:

      conda create -n detr_6dof python=3.8 pip

  Then, activate the environment:

      conda activate detr_6dof
- PyTorch>=1.9.1 (following instructions here)

  For example, if your CUDA version is 10.2, you can install PyTorch and torchvision as follows:

      conda install -c pytorch pytorch=1.9.1 torchvision cudatoolkit=10.2
- PyTorch3D (following instructions here)

  - Using Anaconda Cloud, on Linux only:

        conda install pytorch3d -c pytorch3d

  - Installing from source (GitHub):

        pip install "git+https://github.com/facebookresearch/pytorch3d.git"
- Other requirements

      pip install -r requirements.txt
      cd ./models/ops
      sh ./make.sh
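A quick way to confirm the environment is complete (our sketch; the compiled op's module name is taken from Deformable-DETR, whose make.sh this repo reuses):

```python
# Optional smoke test. If the last import fails, the CUDA ops did not build.
import torch
import pytorch3d
import MultiScaleDeformableAttention  # noqa: F401  (name assumed from Deformable-DETR)

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("pytorch3d:", pytorch3d.__version__)
```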
- Clone this repository

      git clone <repository url>
- Download the dataset from here.
- Run creat_dataset.py

      cd dataset_labelImg3d
      python creat_dataset.py
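To confirm the script ran, a plain directory listing is enough; this helper is ours and assumes nothing about the generated layout:

```python
# List the top level of the dataset folder (set root to your dataset path first).
import os

root = "."  # change to your dataset folder
for name in sorted(os.listdir(root)):
    marker = "[dir] " if os.path.isdir(os.path.join(root, name)) else "      "
    print(marker + name)
```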
- Download the network weights file.
- Modify <project path>/configs/__init__.py to:

      7. configure_name = "config.json"
      8. # configure_name = 'config_linemod.json'
- Modify <project path>/configs/config.json:

      1. {
      2.   "dataset_name": "KITTI3D",
      3.   "dataset_path": <your downloaded KITTI-3D folder>,
      4.   "poses": true,
      5.   "eval": true,
      ...
      11.  "output_dir": "../output_dir_pose",
      ...
      76.  "train": {
      ...
      79.    "resume": <your downloaded weights>,
      ...
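Optionally, sanity-check the edited config before launching; this pre-flight snippet is our own, runs from the project folder, and only assumes the keys shown above:

```python
# Hypothetical pre-flight check for configs/config.json.
import json
import os

with open("configs/config.json") as f:
    cfg = json.load(f)
assert os.path.isdir(cfg["dataset_path"]), f"dataset not found: {cfg['dataset_path']}"
assert os.path.isfile(cfg["train"]["resume"]), f"weights not found: {cfg['train']['resume']}"
print("config OK:", cfg["dataset_name"], "| eval =", cfg["eval"])
```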
- Activate your Python environment, and run:

      cd <project folder>
      python main.py
- Modify <project path>/configs/__init__.py to:

      7. configure_name = "config.json"
      8. # configure_name = 'config_linemod.json'
- Modify <project path>/configs/config.json:

      1. {
      2.   "dataset_name": "KITTI3D",
      3.   "dataset_path": <your downloaded KITTI-3D folder>,
      4.   "poses": true,
      5.   "eval": false,
      ...
      11.  "output_dir": "../output_dir_pose",  <your save folder>
      ...
      76.  "train": {
      77.    "start_epoch": 0,
      78.    "end_epoch": 1000,
      79.    "resume": "",
      80.    "batch_size": 4,  <you can change it according to your GPU capability>
      ...
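If you are unsure what batch_size your GPU can hold, the heuristic below gives a starting point; the per-sample memory budget is our assumption, not the authors' guidance, so shrink the value on out-of-memory errors:

```python
# Rough heuristic: budget ~6 GB of GPU memory per sample.
import torch

if torch.cuda.is_available():
    gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    print(f"GPU memory: {gb:.1f} GB -> try batch_size = {max(1, int(gb // 6))}")
else:
    print("No CUDA GPU detected; training on CPU will be very slow.")
```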
- Activate your Python environment, and run:

      cd <project folder>
      python main.py
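Checkpoints written to output_dir can later be passed back through "resume" to continue training; the helper below lists them, assuming Deformable-DETR's *.pth naming convention:

```python
# List saved checkpoints so "resume" can point at one (naming is an assumption).
import glob

for ckpt in sorted(glob.glob("../output_dir_pose/*.pth")):
    print(ckpt)
```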
- Download the training and testing dataset from here.
- Download the network weights file.
- Modify <project path>/configs/__init__.py to:

      7. # configure_name = "config.json"
      8. configure_name = 'config_linemod.json'
- Modify <project path>/configs/config_linemod.json:

      1. {
      2.   "dataset_name": "Linemod_preprocessed",
      3.   "dataset_path": <your downloaded Linemod_preprocessed folder>/02,
      4.   "poses": true,
      5.   "eval": true,
      ...
      11.  "output_dir": "../output_dir_pose",
      ...
      76.  "train": {
      ...
      79.    "resume": <your downloaded weights>,
      ...
- Activate your Python environment, and run:

      cd <project folder>
      python main.py
- Modify <project path>/configs/__init__.py to:

      7. # configure_name = "config.json"
      8. configure_name = 'config_linemod.json'
- Modify <project path>/configs/config_linemod.json:

      1. {
      2.   "dataset_name": "Linemod_preprocessed",
      3.   "dataset_path": <your downloaded Linemod_preprocessed folder>/02,
      4.   "poses": true,
      5.   "eval": false,
      ...
      11.  "output_dir": "../output_dir_pose",  <your save folder>
      ...
      81.  "train": {
      82.    "start_epoch": 0,
      83.    "end_epoch": 1000,
      84.    "resume": "",
      85.    "batch_size": 4,  <you can change it according to your GPU capability>
      ...
- Activate your Python environment, and run:

      cd <project folder>
      python main.py
This work is based on PyTorch, PyTorch3D, and Deformable-DETR. It is also inspired by DETR.
The methods provided on this page are published under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. This means that you must attribute the work in the manner specified by the authors, you may not use this work for commercial purposes, and if you alter, transform, or build upon this work, you may distribute the resulting work only under the same license. If you are interested in commercial usage, you can contact us for further options.