Simon Klenk1,2* Marvin Motzet1,2* Lukas Koestler1,2 Daniel Cremers1,2
*equal contribution
1Technical University of Munich (TUM) 2Munich Center for Machine Learning (MCML)
International Conference on 3D Vision (3DV) 2024, Davos, CH
Paper (arXiv) | Video | Poster | BibTeX
Event cameras offer the exciting possibility of tracking the camera's pose during high-speed motion and in adverse lighting conditions. Despite this promise, existing event-based monocular visual odometry (VO) approaches demonstrate limited performance on recent benchmarks. To address this limitation, some methods resort to additional sensors such as IMUs, stereo event cameras, or frame-based cameras. Nonetheless, these additional sensors limit the application of event cameras in real-world devices since they increase cost and complicate system requirements. Moreover, relying on a frame-based camera makes the system susceptible to motion blur and HDR conditions. To remove the dependency on additional sensors and to push the limits of using only a single event camera, we present Deep Event VO (DEVO), the first monocular event-only system with strong performance on a large number of real-world benchmarks. DEVO sparsely tracks selected event patches over time. A key component of DEVO is a novel deep patch selection mechanism tailored to event data. We significantly decrease the pose tracking error on seven real-world benchmarks by up to 97% compared to event-only methods and often surpass or are close to stereo or inertial methods.
During training, DEVO takes event voxel grids as input.
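For intuition, here is a minimal sketch (not the repository's implementation) of how a stream of events (x, y, t, p) can be accumulated into such a voxel grid; the bin count, resolution, and bilinear temporal interpolation are illustrative assumptions.

```python
import numpy as np

def events_to_voxel_grid(x, y, t, p, num_bins=5, height=480, width=640):
    """Accumulate events (x, y, t, polarity) into a (num_bins, H, W) voxel grid.

    Each event's polarity is split between its two nearest temporal bins
    (bilinear interpolation in time). Defaults are illustrative, not DEVO's
    exact settings.
    """
    voxel = np.zeros((num_bins, height, width), dtype=np.float32)
    if t.size == 0:
        return voxel

    x, y = x.astype(int), y.astype(int)
    pol = np.where(p > 0, 1.0, -1.0)

    # Normalize timestamps to [0, num_bins - 1] and split each event
    # across the two neighboring temporal bins.
    t_norm = (t - t[0]) / max(float(t[-1] - t[0]), 1e-9) * (num_bins - 1)
    left = np.floor(t_norm).astype(int)
    right = np.clip(left + 1, 0, num_bins - 1)
    w_right = t_norm - left

    np.add.at(voxel, (left, y, x), pol * (1.0 - w_right))
    np.add.at(voxel, (right, y, x), pol * w_right)
    return voxel
```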
The code was tested on Ubuntu 22.04 and CUDA Toolkit 11.x. We use Anaconda to manage our Python environment.
First, clone the repo:
```bash
git clone https://github.com/tum-vision/DEVO.git --recursive
cd DEVO
```
Then, create and activate the Anaconda environment:
```bash
conda env create -f environment.yml
conda activate devo
```
Next, install the DEVO package:
```bash
# download and unzip the Eigen source code
wget https://gitlab.com/libeigen/eigen/-/archive/3.4.0/eigen-3.4.0.zip
unzip eigen-3.4.0.zip -d thirdparty

# install DEVO
pip install .
```
The following steps are only needed if you intend to (re)train DEVO. Please note that the training data take up about 1.1TB (RGB: 300GB, events: 370GB). Otherwise, skip this part and continue with the pretrained model and evaluation below.
First, download all RGB images and depth maps of TartanAir from the left camera (~500GB) to `<TARTANPATH>`:
```bash
python thirdparty/tartanair_tools/download_training.py --output-dir <TARTANPATH> --rgb --depth --only-left
```
Next, generate event voxel grids using vid2e:
```bash
python scripts/convert_tartan.py --dirsfile <path to .txt file>
```
`--dirsfile` expects a .txt file containing line-separated paths to directories with .png images (events are generated for these images).
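If helpful, the dirsfile can be created with a short helper like the following (a sketch, not part of the repository; it simply lists every directory under your TartanAir root that contains .png images):

```python
import argparse
from pathlib import Path

# Write all directories below --root that contain .png images into a
# dirsfile for scripts/convert_tartan.py (one path per line).
parser = argparse.ArgumentParser()
parser.add_argument("--root", required=True, help="TartanAir root, e.g. <TARTANPATH>")
parser.add_argument("--out", default="dirs.txt")
args = parser.parse_args()

dirs = sorted({str(png.parent) for png in Path(args.root).rglob("*.png")})
Path(args.out).write_text("\n".join(dirs) + "\n")
print(f"Wrote {len(dirs)} directories to {args.out}")
```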
We provide a pretrained model for our simulated event data.
```bash
# download model (~40MB)
./download_model.sh
```
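To verify the download, the checkpoint can be inspected like any PyTorch file (a sketch; whether `DEVO.pth` is a plain state dict or a wrapper dict is an assumption, so the snippet only prints the top-level keys):

```python
import torch

# Load the checkpoint on CPU and print its top-level structure.
ckpt = torch.load("DEVO.pth", map_location="cpu")
if isinstance(ckpt, dict):
    for key, value in list(ckpt.items())[:10]:
        shape = tuple(value.shape) if torch.is_tensor(value) else type(value).__name__
        print(key, shape)
else:
    print(type(ckpt))
```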
We evaluate DEVO on seven real-world event-based datasets (FPV, VECtor, HKU, EDS, RPG, MVSEC, TUM-VIE). We provide scripts for data preprocessing (undistortion, ...). Check `scripts/pp_DATASETNAME.py` for the way to preprocess the original datasets. This will create the necessary files for you, e.g. `rectify_map.h5`, `calib_undist.json` and `t_offset_us.txt`.
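To sanity-check the preprocessing output, the generated files can be opened with standard tools (a sketch; the dataset keys inside `rectify_map.h5` and the JSON fields of `calib_undist.json` are not documented here, so the snippet only lists them):

```python
import json
import h5py

# Inspect the files produced by scripts/pp_DATASETNAME.py.
with h5py.File("rectify_map.h5", "r") as f:
    print("rectify_map.h5 datasets:", list(f.keys()))

with open("calib_undist.json") as f:
    print("calib_undist.json fields:", list(json.load(f).keys()))

with open("t_offset_us.txt") as f:
    print("time offset [us]:", f.read().strip())
```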
Make sure you have completed the data generation steps above. Your dataset directory structure should look as follows:
```
<TARTANPATH>
├── abandonedfactory
├── abandonedfactory_night
├── ...
└── westerndesert
```
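A quick check along these lines (a sketch; the event file extensions produced by the conversion step are an assumption) can confirm that images and converted event data exist for every scene:

```python
from pathlib import Path

# Count images and converted event files per TartanAir scene.
root = Path("<TARTANPATH>")  # replace with your actual path
for scene in sorted(p for p in root.iterdir() if p.is_dir()):
    n_png = sum(1 for _ in scene.rglob("*.png"))
    n_evs = sum(1 for _ in scene.rglob("*.h5")) + sum(1 for _ in scene.rglob("*.npy"))
    print(f"{scene.name}: {n_png} images, {n_evs} event files")
```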
To train DEVO with the default configuration, run
```bash
python train.py -c="config/DEVO_base.conf" --name=<your name>
```
The log files will be written to `runs/<your name>`. Please check `train.py` for more options.
Make sure you have completed the steps above (downloading the pretrained model, downloading the data, and preprocessing it). Then run
```bash
python evals/eval_evs/eval_DATASETNAME_evs.py --datapath=<DATASETPATH> --weights="DEVO.pth" --stride=1 --trials=1 --expname=<your name>
```
The qualitative and quantitative results will be written to `results/DATASETNAME/<your name>`. Check `eval_rpg_evs.py` for more options.
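To evaluate several datasets in one go, a small driver script like the following can be used (a sketch; it assumes each dataset has a matching `eval_DATASETNAME_evs.py` script and that the placeholder paths are replaced with your local layout):

```python
import subprocess

# Run the event-based evaluation for several datasets sequentially.
# Dataset names and paths are placeholders; adjust them to your setup.
datasets = {
    "rpg": "<DATASETPATH>/rpg",
    "hku": "<DATASETPATH>/hku",
    "eds": "<DATASETPATH>/eds",
}

for name, path in datasets.items():
    cmd = [
        "python", f"evals/eval_evs/eval_{name}_evs.py",
        f"--datapath={path}",
        "--weights=DEVO.pth",
        "--stride=1",
        "--trials=1",
        f"--expname=batch_{name}",
    ]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)
```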
- Code and model are released.
- Code for simulation is released.
If you find our work useful, please cite our paper:
```bibtex
@inproceedings{klenk2023devo,
  title     = {Deep Event Visual Odometry},
  author    = {Klenk, Simon and Motzet, Marvin and Koestler, Lukas and Cremers, Daniel},
  booktitle = {International Conference on 3D Vision, 3DV 2024, Davos, Switzerland, March 18-21, 2024},
  pages     = {739--749},
  publisher = {{IEEE}},
  year      = {2024},
}
```
We thank the authors of the following repositories for publicly releasing their work:
- DPVO
- TartanAir
- vid2e
- E2Calib
- rpg_trajectory_evaluation
- Event-based Vision for VO/VIO/SLAM in Robotics
This work was supported by the ERC Advanced Grant SIMULACRON.