Here we provide the performance of Unicorn on multiple tasks (Object Detection, Instance Segmentation, and Object Tracking). The complete model weights and the corresponding training logs are available via the links below.
The object detector of Unicorn is pretrained and evaluated on COCO. In this step, there is no segmentation head and the network is trained using only box-level annotations. A sketch for inspecting the downloaded checkpoints follows the table.
Experiment | Backbone | Box AP | Model | Log |
---|---|---|---|---|
unicorn_det_convnext_large_800x1280 | ConvNeXt-Large | 53.7 | model | log |
unicorn_det_convnext_tiny_800x1280 | ConvNeXt-Tiny | 53.1 | model | log |
unicorn_det_r50_800x1280 | ResNet-50 | 51.7 | model | log |
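After downloading a checkpoint, you can sanity-check it before training or evaluation. The snippet below is a minimal sketch: the assumption that the weights live under a `"model"` key follows the common YOLOX-style checkpoint layout, but please verify it against the file you actually download.

```python
import torch

# Load a downloaded detector checkpoint on CPU and inspect its contents.
ckpt = torch.load(
    "Unicorn_outputs/unicorn_det_r50_800x1280/best_ckpt.pth",
    map_location="cpu",
)
print(list(ckpt.keys()))

# Assumption: the weights live under a "model" key (YOLOX-style layout);
# otherwise treat the loaded object itself as the state dict.
state_dict = ckpt.get("model", ckpt)
print(f"{len(state_dict)} parameter tensors")
```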
Please note that this part is optional. The training of downstream tracking tasks does not rely on it, so feel free to skip it unless you are interested in instance segmentation on COCO. In this step, a segmentation head is appended to the pretrained object detector. The parameters of the object detector are then frozen and only the segmentation head is optimized, so the box AP remains the same as in the previous stage. Here we provide the results of the model with the ConvNeXt-Tiny backbone (a minimal sketch of this freeze-then-train pattern follows the table).
Experiment | Backbone | Mask AP | Model | Log |
---|---|---|---|---|
unicorn_inst_convnext_tiny_800x1280 | ConvNeXt-Tiny | 43.2 | model | log |
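The freeze-then-train recipe described above is a standard PyTorch pattern. Below is a minimal, self-contained sketch; `ToyUnicorn` and the attribute name `seg_head` are placeholders for illustration, not the real Unicorn module names.

```python
import torch
import torch.nn as nn

# Toy stand-in for the real network; "seg_head" is an assumed name
# used only to illustrate the pattern.
class ToyUnicorn(nn.Module):
    def __init__(self):
        super().__init__()
        self.detector = nn.Conv2d(3, 16, 3, padding=1)  # pretrained part
        self.seg_head = nn.Conv2d(16, 1, 1)             # newly added head

model = ToyUnicorn()

# Freeze the detector; only the segmentation head stays trainable.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("seg_head")

# Pass only the trainable (segmentation-head) parameters to the optimizer.
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)
```

Because the frozen detector's outputs are unchanged, only the mask AP is affected by this stage, which is why the table above reports no new box AP.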
There are some inherent conflicts among existing MOT benchmarks.
- Different benchmarks focus on different object classes. For example, MOT Challenge, BDD100K, and TAO cover 1, 8, and 800+ object classes, respectively.
- Different benchmarks follow different labeling rules. For example, MOT Challenge always annotates the whole person, even when the person is heavily occluded or truncated by the image boundary. The other benchmarks do not share this rule.
These factors make it difficult to train one unified model for all MOT benchmarks. To deal with this problem, Unicorn trains two unified models. Specifically, the first model simultaneously handles SOT, BDD100K, VOS, and BDD100K MOTS; the second simultaneously handles SOT, MOT17, VOS, and MOTS Challenge. The SOT and VOS results are reported using the first model.
The results of the first group of models are shown below.
Experiment | Input Size | LaSOT AUC (%) | BDD100K mMOTA (%) | DAVIS17 J&F (%) | BDD100K MOTS mMOTSA (%) | Model | Log Stage1 | Log Stage2 |
---|---|---|---|---|---|---|---|---|
unicorn_track_large_mask | 800x1280 | 68.5 | 41.2 | 69.2 | 29.6 | model | log1 | log2 |
unicorn_track_tiny_mask | 800x1280 | 67.7 | 39.9 | 68.0 | 29.7 | model | log1 | log2 |
unicorn_track_tiny_rt_mask | 640x1024 | 67.1 | 37.5 | 66.8 | 26.2 | model | log1 | log2 |
unicorn_track_r50_mask | 800x1280 | 65.3 | 35.1 | 66.2 | 30.8 | model | log1 | log2 |
The results of the second group of models are shown below.
Experiment | Input Size | MOT17 MOTA (%) | MOTS sMOTSA (%) | Model | Log Stage1 | Log Stage2 |
---|---|---|---|---|---|---|
unicorn_track_large_mot_challenge_mask | 800x1280 | 77.2 | 65.3 | model | log1 | log2 |
We also provide task-specific models for users who are interested in only a subset of the tasks.
Experiment | Input Size | LaSOT AUC (%) | BDD100K mMOTA (%) | DAVIS17 J&F (%) | BDD100K MOTS mMOTSA (%) | Model | Log Stage1 | Log Stage2 |
---|---|---|---|---|---|---|---|---|
unicorn_track_tiny_sot_only | 800x1280 | 67.5 | - | - | - | model | log1 | - |
unicorn_track_tiny_mot_only | 800x1280 | - | 39.6 | - | - | model | log1 | - |
unicorn_track_tiny_vos_only | 800x1280 | - | - | 68.4 | - | model | - | log2 |
unicorn_track_tiny_mots_only | 800x1280 | - | - | - | 28.1 | model | - | log2 |
The downloaded checkpoints should be organized in the following structure:
```
${UNICORN_ROOT}
-- Unicorn_outputs
   -- unicorn_det_convnext_large_800x1280
      -- best_ckpt.pth
   -- unicorn_det_convnext_tiny_800x1280
      -- best_ckpt.pth
   -- unicorn_det_r50_800x1280
      -- best_ckpt.pth
   -- unicorn_track_large_mask
      -- latest_ckpt.pth
   -- unicorn_track_tiny_mask
      -- latest_ckpt.pth
   -- unicorn_track_r50_mask
      -- latest_ckpt.pth
   ...
```
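A quick way to confirm the layout is a small script like the one below. The experiment names are taken from the tables above, and `Path(".")` stands in for `${UNICORN_ROOT}`; adjust both to match the checkpoints you downloaded.

```python
from pathlib import Path

# Sanity-check that downloaded checkpoints sit where the scripts expect.
root = Path(".")  # stand-in for ${UNICORN_ROOT}
expected = {
    "unicorn_det_convnext_tiny_800x1280": "best_ckpt.pth",
    "unicorn_track_large_mask": "latest_ckpt.pth",
    "unicorn_track_tiny_mask": "latest_ckpt.pth",
}
for exp, ckpt in expected.items():
    path = root / "Unicorn_outputs" / exp / ckpt
    status = "OK" if path.is_file() else "MISSING"
    print(f"{status}: {path}")
```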