Model Zoo

Pretraining

In the paper, we use Slowfast R50 + CLIP-B/32 for pretraining (row 3) and fine-tune on a single specified benchmark. We also release rows 1, 2, and 4 for practical use.

| Video Enc. | Text Enc. | Pretraining | Fine-tuning | Checkpoints |
| --- | --- | --- | --- | --- |
| CLIP-B/32 | CLIP-B/32 | 4M | - | Google Drive |
| CLIP-B/32 | CLIP-B/32 | 4M | QVHL + Charades + NLQ + TACoS + ActivityNet + DiDeMo | Google Drive |
| Slowfast R50 + CLIP-B/32 | CLIP-B/32 | 4M | - | Google Drive |
| Slowfast R50 + CLIP-B/32 | CLIP-B/32 | 4M | QVHL + Charades + NLQ + TACoS + ActivityNet + DiDeMo | Google Drive |

For the downstream tasks below, checkpoints are trained on Slowfast R50 + CLIP-B/32 features.

Joint Moment Retrieval and Highlight Detection

Please follow the instructions here to submit test set results to CodaLab.

| Datasets | MR mAP avg (test) | HD HIT@1 (test) | MR mAP avg (val) | HD HIT@1 (val) | Checkpoints + Configs + Prediction + Tensorboard Log |
| --- | --- | --- | --- | --- | --- |
| QVHL | 35.47 | 60.96 | 36.13 | 61.81 | Google Drive |
| QVHL (w/ PT) | 43.63 | 66.28 | 45.44 | 68.77 | Google Drive |

Moment Retrieval

| Datasets | R1@0.3 | mIoU | Checkpoints + Configs + Prediction + Tensorboard Log |
| --- | --- | --- | --- |
| NLQ (w/ PT) | 11.74 | 7.88 | Google Drive |
| Charades (w/ PT) | 72.63 | 52.17 | Google Drive |
| TACoS (w/ PT) | 56.11 | 38.63 | Google Drive |
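
For readers unfamiliar with the two metrics in this table, the following is a minimal sketch of how R1@0.3 and mIoU are commonly computed for moment retrieval. It is not taken from this repository's evaluation code: the function names (`temporal_iou`, `recall_at_1`, `mean_iou`) are hypothetical, spans are assumed to be `(start, end)` pairs in seconds, and the threshold comparison is assumed to be inclusive (`>=`). Each query is assumed to have exactly one ground-truth moment.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start_sec, end_sec); an assumed representation


def temporal_iou(pred: Span, gt: Span) -> float:
    """Intersection over union of two 1-D time intervals."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(top1_preds: List[Span], gts: List[Span],
                iou_threshold: float = 0.3) -> float:
    """Percentage of queries whose top-1 prediction reaches the IoU threshold."""
    if not gts:
        return 0.0
    hits = sum(temporal_iou(p, g) >= iou_threshold
               for p, g in zip(top1_preds, gts))
    return 100.0 * hits / len(gts)


def mean_iou(top1_preds: List[Span], gts: List[Span]) -> float:
    """Average IoU of the top-1 prediction over all queries, as a percentage."""
    if not gts:
        return 0.0
    total = sum(temporal_iou(p, g) for p, g in zip(top1_preds, gts))
    return 100.0 * total / len(gts)


if __name__ == "__main__":
    # Toy data: a perfect match, a partial overlap (IoU = 1/3), and a miss.
    preds = [(0.0, 10.0), (5.0, 15.0), (20.0, 30.0)]
    gts = [(0.0, 10.0), (10.0, 20.0), (100.0, 110.0)]
    print(f"R1@0.3: {recall_at_1(preds, gts):.2f}")
    print(f"mIoU:   {mean_iou(preds, gts):.2f}")
```

The numbers in the table come from the released predictions and the benchmark's own evaluation scripts, which may differ from this sketch in details such as rounding or threshold handling.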

Highlight Detection

| Datasets | Domain | mAP | Checkpoints + Configs + Prediction |
| --- | --- | --- | --- |
| Youtube (w/ PT) | dog | 74.25 | Google Drive |
| Youtube (w/ PT) | gymnastics | 78.89 | Google Drive |
| Youtube (w/ PT) | parkour | 74.39 | Google Drive |
| Youtube (w/ PT) | skating | 84.87 | Google Drive |
| Youtube (w/ PT) | skiing | 75.13 | Google Drive |
| Youtube (w/ PT) | surfing | 83.85 | Google Drive |

| Datasets | Domain | mAP | Checkpoints + Configs + Prediction + Tensorboard Log |
| --- | --- | --- | --- |
| TVSum (w/ PT) | BK | 91.78 | Google Drive |
| TVSum (w/ PT) | BT | 90.47 | Google Drive |
| TVSum (w/ PT) | DS | 77.57 | Google Drive |
| TVSum (w/ PT) | FM | 74.33 | Google Drive |
| TVSum (w/ PT) | GA | 89.78 | Google Drive |
| TVSum (w/ PT) | MS | 83.83 | Google Drive |
| TVSum (w/ PT) | PK | 82.22 | Google Drive |
| TVSum (w/ PT) | PR | 85.81 | Google Drive |
| TVSum (w/ PT) | VT | 92.04 | Google Drive |
| TVSum (w/ PT) | VU | 77.81 | Google Drive |

Video Summarization

| Datasets | F1 score | Checkpoints + Configs + Prediction + Tensorboard Log |
| --- | --- | --- |
| V1 (w/ PT) | 49.85 | Google Drive |
| V2 (w/ PT) | 56.97 | 👆 |
| V3 (w/ PT) | 59.35 | 👆 |
| V4 (w/ PT) | 40.62 | 👆 |