
Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning #8

Open
Dongwoo-Im opened this issue Nov 29, 2023 · 0 comments


Dongwoo-Im commented Nov 29, 2023

github : https://github.com/bair-climate-initiative/scale-mae


[Motivation]

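  • In the satellite domain, image scale varies widely, so a distance measured in the image does not map to a fixed real-world distance
  • With scale-unaware training, no matter how much data the model sees, generalization to unseen cases is hard to guarantee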

[Main factor]

  • GSDPE (Ground Sample Distance Positional Encoding) : lets the model understand both position and scale
  • Laplacian-pyramid decoder : lets the model learn multi-scale representations
  • The authors claim to be the first among MAE variants to combine scale-awareness with a Laplacian pyramid

[Main Figure]

image

[GSDPE]

image
image

  • A g/G term is added to the original PE, which makes the encoding scale-aware
  • It is injected twice: once before the MAE encoder and once at the demask step
    • TODO: look up the related insight and ablations
      image
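
A minimal sketch of the g/G scaling described above, assuming the standard 1D sin-cos positional encoding with base 10000 as the "original PE". The function name `gsd_positional_encoding`, its parameters, and the default `reference_gsd=1.0` are illustrative assumptions, not the repository's API.

```python
import numpy as np

def gsd_positional_encoding(num_pos, dim, gsd, reference_gsd=1.0):
    """1D sin-cos positional encoding with positions scaled by g/G.

    gsd (g) is the ground sample distance of the input image and
    reference_gsd (G) is a reference value; both names are assumptions.
    """
    assert dim % 2 == 0, "dim must be even to pair sin and cos channels"
    pos = np.arange(num_pos, dtype=np.float64)[:, None]   # (num_pos, 1)
    i = np.arange(dim // 2, dtype=np.float64)[None, :]    # (1, dim / 2)
    freq = 1.0 / (10000.0 ** (2.0 * i / dim))              # standard PE frequencies
    angle = (gsd / reference_gsd) * pos * freq             # the added g/G term
    pe = np.empty((num_pos, dim))
    pe[:, 0::2] = np.sin(angle)                            # even channels
    pe[:, 1::2] = np.cos(angle)                            # odd channels
    return pe
```

With g = G this reduces to the ordinary encoding. Doubling g makes position 1 encode like position 2 of the reference scale, which is how a patch index gets tied to a ground distance in this sketch.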

[Image]

  • $I_{hr}$ = initial higher resolution image : random crop (448x448) from original image
  • $I$ = input image : $I_{hr}$ downsample to 224x224
  • high-freq GT : $I_{hr}$ downsample to 56x56 and upsample to 448x448 and subtract from $I_{hr}$
    • capture object edges, roads, and building outlines
  • low-freq GT : $I_{hr}$ downsample to 14x14 and upsample to 224x224
    • capture color gradients and landscapes
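
The target construction above can be sketched as follows. The sizes (448, 224, 56, 14) come from the notes; the resampling kernels are assumptions, since the notes do not name them: this sketch uses block averaging to downsample and nearest-neighbour repetition to upsample. The helper names are hypothetical.

```python
import numpy as np

def downsample(img, size):
    """Block-average an (H, W, C) image to (size, size, C).
    Assumes H and W are multiples of size."""
    h, w, c = img.shape
    fh, fw = h // size, w // size
    return img.reshape(size, fh, size, fw, c).mean(axis=(1, 3))

def upsample(img, size):
    """Nearest-neighbour upsample a square (h, h, C) image to (size, size, C)."""
    f = size // img.shape[0]
    return img.repeat(f, axis=0).repeat(f, axis=1)

def make_targets(i_hr):
    """Return (input image, high-freq GT, low-freq GT) from a 448x448 crop."""
    assert i_hr.shape[:2] == (448, 448)
    i = downsample(i_hr, 224)                           # model input I
    high = i_hr - upsample(downsample(i_hr, 56), 448)   # residual: edges, outlines
    low = upsample(downsample(i_hr, 14), 224)           # blurred: colors, landscape
    return i, high, low
```

On a constant image the high-frequency target is all zeros and the low-frequency target equals the constant, which matches the intuition that the residual keeps only fine detail.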

[Decoder]

  • decoding : standard MAE decoder, with depth reduced from 8 layers to 3
  • upsampling : upsample x2 and x4, passed to Laplacian blocks
  • reconstruction : Laplacian blocks (feature mapping, upsample, reconstruction) with L1 loss and L2 loss
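
A rough sketch of how the two losses might be combined. The notes do not say which loss goes with which output; this sketch assumes L1 on the high-frequency reconstruction and L2 on the low-frequency one, with equal weights, averaged over all pixels rather than masked patches only. All function names are hypothetical.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error."""
    return np.abs(pred - target).mean()

def l2_loss(pred, target):
    """Mean squared error."""
    return ((pred - target) ** 2).mean()

def reconstruction_loss(high_pred, high_gt, low_pred, low_gt):
    # Assumption: L1 for the high-frequency band, L2 for the low-frequency band.
    return l1_loss(high_pred, high_gt) + l2_loss(low_pred, low_gt)
```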

[Evaluation]

image
image
