- Term project results for AAA534 (Computer Vision) at Korea University
- This work is based on MOTDT, one of the state-of-the-art algorithms for real-time multiple object tracking
- For more information, please refer to the report file in this repository
- STEP1: Estimate the bounding boxes for frame `t+1` from the current frame `t` using a Kalman filter
- STEP2: Detect objects at time `t+1` using R-FCN
- STEP3: Filter the objects estimated in STEP1 and the objects detected in STEP2 through Non-Maximum Suppression
- STEP4: Calculate the homography matrix between frames `t` and `t+1`
- STEP5: Create candidates by linearly transforming the existing objects at time `t` with the homography matrix obtained in STEP4
- STEP6: Assign the bounding box candidates from STEP3 and STEP5 to each object based on IoU and ReID features
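The geometric parts of STEP3, STEP5, and STEP6 can be illustrated with a minimal sketch. This is not the code used in this repository: the function names `iou`, `nms`, and `warp_box`, the `(x1, y1, x2, y2)` box format, and the `0.5` threshold are all assumptions made for illustration. The homography `H` is taken as given here; in practice it could be estimated from matched keypoints, for example with OpenCV's `cv2.findHomography`, but the report should be consulted for how this project actually does it.

```python
import numpy as np


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression (as in STEP3); returns kept indices.

    Boxes are visited from highest to lowest score, and a box is kept only
    if it does not overlap an already kept box by more than iou_thresh.
    The threshold value is an assumption, not taken from the project.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


def warp_box(box, H):
    """Map a box from frame t to frame t+1 with a 3x3 homography H (STEP5).

    The four corners are transformed in homogeneous coordinates, and the
    axis-aligned box enclosing the warped corners is returned.
    """
    x1, y1, x2, y2 = box
    corners = np.array(
        [[x1, y1, 1.0], [x2, y1, 1.0], [x2, y2, 1.0], [x1, y2, 1.0]]
    ).T
    warped = H @ corners
    warped = warped[:2] / warped[2]  # back from homogeneous coordinates
    return (warped[0].min(), warped[1].min(), warped[0].max(), warped[1].max())
```

For example, with a homography that only translates the image by 10 pixels in x and 5 in y, `warp_box((0, 0, 10, 20), H)` moves the box to `(10, 5, 20, 25)`. The warped boxes would then be scored against existing tracks with `iou` (together with appearance features) in STEP6.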
- The original model cannot maintain the track ID of object 1 (it changes to 101) when the object is covered by object 105
- Our method maintains the track IDs of objects 1 and 89 even though they are obscured by object 161, who is carrying a green bag
- The original model cannot maintain the track ID of object 427 (it changes to 509) because of a sudden change in camera angle
- Our method maintains the track ID of object 515 even though there is a sudden change in camera angle at the end of the clip
Metric | Original | Proposed |
---|---|---|
IDF1 | 0.503 | 0.522 |
Mostly Tracked | 59 | 70 |
Mostly Lost | 151 | 152 |
False Positive | 919 | 3,057 |
Num_Misses | 28,580 | 26,781 |
Num_Switches | 200 | 198 |
Num_Fragment | 706 | 574 |
MOTA | 0.428 | 0.421 |
MOTP | 0.152 | 0.164 |
Metric | Original | Proposed |
---|---|---|
IDF1 | 0.547 | 0.579 |
Mostly Tracked | 75 | 97 |
Mostly Lost | 94 | 97 |
False Positive | 725 | 3,064 |
Num_Misses | 22,704 | 19,818 |
Num_Switches | 504 | 386 |
Num_Fragment | 1,604 | 806 |
MOTA | 0.524 | 0.538 |
MOTP | 0.094 | 0.116 |
- There is a clear trade-off between the original and the proposed method
- False Positives increased substantially (from 919 to 3,057 and from 725 to 3,064) because of the additional bounding boxes generated by the homography, while Mostly Tracked, which counts the objects tracked successfully in at least 80% of their frames, improved
- Additionally, the number of misses and the number of fragments decreased considerably because of the supplementary bounding boxes
- Tracking time increased enormously, which is the main downside of the proposed method
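The trade-off can be made concrete by summing the three error counts that enter MOTA. Under the standard CLEAR MOT definition, MOTA = 1 - (false positives + misses + ID switches) / (number of ground-truth boxes). The ground-truth count is not listed in the tables, so the sketch below only compares the numerators. The helper name `total_errors` and the labels "first table" and "second table" are introduced here for illustration.

```python
# Error counts copied from the two result tables above:
# (False Positive, Num_Misses, Num_Switches)
tables = {
    "first table": {"original": (919, 28580, 200), "proposed": (3057, 26781, 198)},
    "second table": {"original": (725, 22704, 504), "proposed": (3064, 19818, 386)},
}


def total_errors(fp, misses, switches):
    """Sum of the error terms in the MOTA numerator."""
    return fp + misses + switches


for name, rows in tables.items():
    orig = total_errors(*rows["original"])
    prop = total_errors(*rows["proposed"])
    print(f"{name}: original={orig}, proposed={prop}, change={prop - orig:+d}")
```

In the first table the extra false positives slightly outweigh the recovered misses (a net change of +337 errors), which matches the small drop in MOTA from 0.428 to 0.421. In the second table the recovered misses dominate (a net change of -665 errors), which matches the rise in MOTA from 0.524 to 0.538.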