1703.06870 - Mask R-CNN

♥ DL , CV , RCNN, Object Detection

Extend Faster RCNN by adding a branch for predicting segmentation masks for each ROI, in parallel with existing branch for classification and bbox regression
Review of Faster RCNN
- Two stages
  - Get candidate object bboxes from RPN
  - Extract features using RolPool from each candidate box
  - Finally perform classification and bbox regression.
- The features used by both stages can be shared for faster inference
- Faster-RCNN is not designed for pixel-to-pixel alignment
  - RoiPool performs coarse spatial quantization for feature extraction
  - How to fix: change it to RoiAlign which is quantization-free and can faithfully preserves exact spatial locations
Mask R-CNN
- Two stages
  - Similar to faster RCNN in the first stage(RPN)
  - When predicting the class and box offset, Mask R-CNN also outputs a binary mask(third branch) for each RoI
- $L = L_{cls} + L_{box} + L_{mask}$
  - $L_{mask}$ uses per-pixel sigmoid instead of softmax
  - For an ROI associated with ground-truch class k, $L_{mask}$ is only defined on k-th mask
  - Other mask outputs are irrelevent $\rightarrow$ we can generate masks for each class without competition among classes
  - Decouple mask and class prediction
- Mask branch:
  - Use a FCN(full collected network) to predics m*m mask from each ROI
  - Need ROI features to be well aligned $\rightarrow$ RoIAlign
- RoIAlign:
  - In RoIPool, quantization leads to misalignments
    - Does not affect classification
    - Affects pixel-accurate prediction
  - In RoIAlign, we avoid quantization: round(x / 16) $\rightarrow$ x / 16
- Archetecture:
  - Extend two kinds of backbone heads:
    - ResNet
    - FPN(feature pyramid network)
Mask RCNN is easy to generalize
- Instance segmentation
- Bbox object detection
- Person keypoint detection
  - Model a keypoint’s location as a one-hot m * m mask, where onlu a single pixel is labeled as foreground
  - Adopt Mask R-CNN to predict K masks, one for each of K keypoint types (e.g., left shoulder, right elbow).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

1703.06870 - Mask R-CNN.md

1703.06870 - Mask R-CNN.md

1703.06870 - Mask R-CNN

Files

1703.06870 - Mask R-CNN.md

Latest commit

History

1703.06870 - Mask R-CNN.md

File metadata and controls

1703.06870 - Mask R-CNN