Cross-view multi-human tracking aims to associate human subjects across frames and camera views that contain substantial overlap. Although cross-view multi-human tracking has received increasing attention in recent years, existing datasets still have several issues, including 1) missing real-world scenarios, 2) lacking diverse scenes, 3) containing a limited number of tracks, 4) comprising only static cameras, and 5) lacking standard benchmarks, which hinders the exploration and fair comparison of cross-view tracking methods.

To address these concerns, we present DIVOTrack: a new cross-view multi-human tracking dataset for DIVerse Open scenes with densely tracked pedestrians in realistic, non-experimental environments. DIVOTrack contains ten different types of scenarios and 550 cross-view tracks, surpassing all existing cross-view human tracking datasets. Furthermore, its videos are collected by two mobile cameras and one unmanned aerial vehicle, allowing us to evaluate how methods cope with dynamic views. Finally, we present a summary of current methodologies and a set of standard benchmarks on DIVOTrack to provide a fair comparison and a thorough analysis of current approaches.
- Dataset Description
- Dataset Structure
- Dataset Downloads
- Training Detector
- Single-view Tracking
- Cross-view Tracking
- TrackEval
- Multi_view_Tracking
- MOTChallengeEvalKit_cv_test
The test results of the cross-view MOT baseline method MvMHAT on DIVOTrack.
The ground truth of DIVOTrack.
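A minimal sketch of loading these files, assuming they are plain NumPy .npy arrays (the file names below are hypothetical placeholders; use the actual files shipped in the download package, whose array layout is documented there rather than here):

```python
import numpy as np

# Hypothetical file names; substitute the actual files from the npy folder.
results = np.load("npy/mvmhat_test_results.npy", allow_pickle=True)
gt = np.load("npy/ground_truth.npy", allow_pickle=True)

# Inspect the types and shapes before any further processing.
print(type(results), getattr(results, "shape", None))
print(type(gt), getattr(gt, "shape", None))
```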
We collect data in 10 different real-world scenarios, named 'Circle', 'Shop', 'Moving', 'Park', 'Ground', 'Gate1', 'Floor', 'Side', 'Square', and 'Gate2'. All sequences are captured with 3 moving cameras, 'View1', 'View2', and 'View3', and are manually synchronized.
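For reference, a short sketch that enumerates every scene/view combination (the `{scene}_{view}` naming below is purely illustrative; the actual sequence names follow the released annotation files):

```python
# 10 scenes x 3 views = 30 synchronized sequences in total.
scenes = ['Circle', 'Shop', 'Moving', 'Park', 'Ground',
          'Gate1', 'Floor', 'Side', 'Square', 'Gate2']
views = ['View1', 'View2', 'View3']

# Hypothetical "{scene}_{view}" naming, for illustration only.
sequences = [f"{scene}_{view}" for scene in scenes for view in views]
print(len(sequences))  # 30
```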
The structure of our dataset is as follows:
DIVOTrack
└——————datasets
       └——————DIVO
              |——————images
              |      |——————annotations
              |      |——————dets
              |      |——————train
              |      └——————test
              |——————labels_with_ids
              |      |——————train
              |      └——————test
              |——————npy
              |——————ReID_format
              |      |——————bounding_box_test
              |      |——————bounding_box_train
              |      └——————query
              └——————boxes.json
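A minimal sanity check that the decompressed archives reproduce this layout (paths taken directly from the tree above; adjust `root` if your dataset lives elsewhere):

```python
import os

# Expected entries under the dataset root, as listed in the tree above.
root = "DIVOTrack/datasets/DIVO"
expected = [
    "images/annotations", "images/dets", "images/train", "images/test",
    "labels_with_ids/train", "labels_with_ids/test",
    "npy",
    "ReID_format/bounding_box_test", "ReID_format/bounding_box_train", "ReID_format/query",
    "boxes.json",
]

for rel in expected:
    path = os.path.join(root, rel)
    print(("ok      " if os.path.exists(path) else "MISSING ") + path)
```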
The whole dataset can be downloaded from GoogleDrive. Decompress each .tar.gz file into its own folder. After that, run generate_ini.py to generate the seqinfo.ini file.
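For orientation, seqinfo.ini follows the MOTChallenge convention; the sketch below writes such a file with placeholder values (the real values for each DIVOTrack sequence are produced by generate_ini.py):

```python
# Sketch only: write a MOTChallenge-style seqinfo.ini with placeholder values.
from configparser import ConfigParser

cfg = ConfigParser()
cfg.optionxform = str  # keep key case (imDir, frameRate, ...) as MOT tools expect

cfg["Sequence"] = {
    "name": "Circle_View1",   # hypothetical sequence name
    "imDir": "img1",
    "frameRate": "30",
    "seqLength": "1000",
    "imWidth": "1920",
    "imHeight": "1080",
    "imExt": ".jpg",
}

with open("seqinfo.ini", "w") as f:
    cfg.write(f)
```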
The training process of our detector is in ./Training_detector/, and the details can be found in Training_detector/Readme.md.
The implementation of the single-view tracking baseline methods is in ./Single_view_Tracking, and the details can be found in Single_view_Tracking/Readme.md.
The implementation of the cross-view tracking baseline methods is in ./Cross_view_Tracking, and the details can be found in Cross_view_Tracking/Readme.md.
We evaluate each single-view tracking baseline with ./TrackEval, and the details can be found in TrackEval/Readme.md.
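As an illustration only, the upstream TrackEval library can be driven from Python roughly as follows; the folder paths and the choice of metrics here are placeholders, and the DIVO-specific configuration used in this repo is described in TrackEval/Readme.md:

```python
import trackeval  # the upstream TrackEval package bundled in ./TrackEval

# Start from the default configs, then point them at placeholder folders.
eval_config = trackeval.Evaluator.get_default_eval_config()
dataset_config = trackeval.datasets.MotChallenge2DBox.get_default_dataset_config()
dataset_config["GT_FOLDER"] = "path/to/gt"              # placeholder
dataset_config["TRACKERS_FOLDER"] = "path/to/trackers"  # placeholder

evaluator = trackeval.Evaluator(eval_config)
dataset_list = [trackeval.datasets.MotChallenge2DBox(dataset_config)]
metrics_list = [trackeval.metrics.HOTA(),
                trackeval.metrics.CLEAR(),
                trackeval.metrics.Identity()]

# Runs the evaluation and reports the metric tables for each tracker.
evaluator.evaluate(dataset_list, metrics_list)
```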
The multi-view tracking results can be obtained from ./Multi_view_Tracking, and the details can be found in Multi_view_Tracking/Readme.md.
The cross-view evaluation can be obtained from ./MOTChallengeEvalKit_cv_test, and the details can be found in ./MOTChallengeEvalKit_cv_test/Readme.md.
Any use whatsoever of this dataset and its associated software shall constitute your acceptance of the terms of this agreement. By using the dataset and its associated software, you agree to cite the authors' papers in any publications by you or your collaborators that make any use of the dataset, in the following format:
@article{wangdivotrack,
title={DIVOTrack: A Cross-View Dataset for Multi-Human Tracking in DIVerse Open Scenes},
author={Wang, Gaoang and Hao, Shengyu and Zhan, Yibing and Liu, Peiyuan and Liu, Zuozhu and Song, Mingli and Hwang, Jenq-Neng},
year={2022}
}
The license agreement for data usage implies citation of the paper above. Please note that citing the dataset URL instead of the publications would not comply with this license agreement. You can read the LICENSE from LICENSE.
If you have any concerns, please contact [email protected].