Fully Motion-Aware Network for Video Object Detection

Architecture

Summary

Similar to FGFA, but in addition to pixel-level feature calibration and aggregation, MANet proposes a motion pattern reasoning module that dynamically combines (via learnable soft weights) pixel-level and instance-level calibration according to the motion (optical flow estimated by FlowNet). Instance-level calibration is achieved by regressing relative movements $(\Delta x , \Delta y , \Delta w , \Delta h)$ from the optical flow estimate at the proposal positions of the reference frame. The final feature maps for the detection network (R-FCN) are the aggregation of calibrated feature maps from nearby frames (13 frames in total). Pixel-level calibration yields larger improvements for non-rigid movements, while instance-level calibration is better for rigid movements and occlusion cases.
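
The three steps in the summary (moving a proposal by its regressed relative movement, softly blending the two calibrations, and aggregating over nearby frames) can be illustrated with a small NumPy sketch. This is not the paper's implementation: the function names are hypothetical, the box update assumes the common R-CNN-style parameterization (offsets scaled by box size, log-scale width and height), the soft weight is passed in as a plain number rather than predicted by a learned module, and the aggregation uses a simple normalized weighted average.

```python
import numpy as np


def apply_relative_movement(box, delta):
    """Move a proposal (cx, cy, w, h) by (dx, dy, dw, dh).

    Assumption: R-CNN-style parameterization, i.e. dx and dy are
    fractions of the box size and dw, dh are log-scale factors.
    """
    cx, cy, w, h = box
    dx, dy, dw, dh = delta
    return (cx + dx * w, cy + dy * h, w * np.exp(dw), h * np.exp(dh))


def combine_calibrations(pixel_feat, instance_feat, alpha):
    """Blend pixel-level and instance-level calibrated features.

    alpha is a soft weight in [0, 1]; in MANet it would come from the
    motion pattern reasoning module, here it is just an input.
    """
    alpha = np.clip(alpha, 0.0, 1.0)
    return alpha * instance_feat + (1.0 - alpha) * pixel_feat


def aggregate_frames(calibrated_feats, weights=None):
    """Weighted average of per-frame feature maps, each of shape (C, H, W)."""
    feats = np.stack(calibrated_feats, axis=0)  # (T, C, H, W)
    if weights is None:
        # Uniform weights when none are given (an illustrative default).
        weights = np.ones(feats.shape[0])
    weights = np.asarray(weights, dtype=feats.dtype)
    weights = weights / weights.sum()
    # Contract the frame axis: (T,) x (T, C, H, W) -> (C, H, W)
    return np.tensordot(weights, feats, axes=(0, 0))


if __name__ == "__main__":
    moved = apply_relative_movement((10.0, 20.0, 4.0, 8.0), (0.5, -0.25, 0.0, 0.0))
    print(moved)

    pixel = np.full((2, 3, 3), 2.0)
    instance = np.full((2, 3, 3), 6.0)
    blended = combine_calibrations(pixel, instance, alpha=0.25)

    # 13 frames in total, matching the aggregation window in the summary.
    frames = [blended + i for i in range(13)]
    print(aggregate_frames(frames).shape)
```

A weight near 1 favors the instance-level result (rigid motion, occlusion), while a weight near 0 favors the pixel-level result (non-rigid motion), which mirrors the trade-off described above.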