About window_size in MOAT #146

Open
edwardyehuang opened this issue Nov 1, 2022 · 3 comments
@edwardyehuang

I found that window_size is None by default in MOAT:

def get_model(
    name: str,
    input_shape: list[int],
    window_size: Optional[list[list[int]]] = None,
    survival_rate: Optional[float] = None,
    pool_size: Optional[int] = 3,
    override_config: Optional[dict[str, Any]] = None,
    pretrained_weights_path: Optional[str] = None,
    remove_position_embedding: Optional[bool] = None,
    return_config: Optional[bool] = False,
    strict_loading: Optional[bool] = False,
    use_checkpointing_for_attention: Optional[bool] = False,
    global_attention_at_end_of_moat_stage: Optional[bool] = False,
) -> ...:

    ...
    config.window_size = window_size

Is only global attention used for segmentation tasks?

@Chenglin-Yang

Thanks for your interest!

Please see the docstring:

window_size: A list of two integers, spatial size of input for window

for how to specify the desired window size for your use case.

Our settings can be found in the experimental sections of the paper, but I can provide the information here:
For COCO object detection, we use window-based attention for the third stage with a window size of 14x14, and global attention for the fourth stage.
For ADE20K semantic segmentation, we use global attention for both the third and fourth stages.
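
A rough sketch of what those two settings could look like as calls to get_model, assuming window_size holds one [height, width] entry per MOAT attention stage and that None requests global attention (the model name and input shapes below are placeholders, not values taken from the repository):

# Hypothetical usage; the per-stage window_size format is an assumption.

# COCO detection: 14x14 windows in the third stage, global in the fourth.
detection_model = get_model(
    name='moat0',                   # placeholder model name
    input_shape=[1024, 1024, 3],
    window_size=[[14, 14], None],   # assumed per-stage format
)

# ADE20K segmentation: global attention in both attention stages.
segmentation_model = get_model(
    name='moat0',
    input_shape=[641, 641, 3],
    window_size=None,               # default: global attention everywhere
)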

@edwardyehuang
Author


Thanks for pointing this out.

I also noticed that the implementation of the global window is flawed.

When the global window is used, the current implementation still records a fixed window size derived from the input size at build time. As a result, if an input size given later differs from the recorded size, the "global" attention is either limited to the recorded window or an error is raised directly (e.g., when the input size is smaller than the recorded window size).
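
A minimal toy sketch of that failure mode (illustrative only, not the actual MOAT code):

import tensorflow as tf

class ToyGlobalAttention(tf.keras.layers.Layer):
    # Toy layer reproducing the reported behavior: with window_size=None
    # ("global"), the window is silently frozen to the build-time size.

    def __init__(self, window_size=None):
        super().__init__()
        self.window_size = window_size

    def build(self, input_shape):
        if self.window_size is None:
            # "Global" attention becomes a fixed window equal to the
            # spatial size of the first input the layer sees.
            self.window_size = [input_shape[1], input_shape[2]]

    def call(self, inputs):
        height, width = inputs.shape[1], inputs.shape[2]
        if height < self.window_size[0] or width < self.window_size[1]:
            raise ValueError(
                f'Input {height}x{width} is smaller than the recorded '
                f'window {self.window_size[0]}x{self.window_size[1]}.')
        return inputs  # attention math omitted; only the shape logic matters

layer = ToyGlobalAttention(window_size=None)
layer(tf.zeros([1, 64, 64, 3]))  # builds; window recorded as 64x64
layer(tf.zeros([1, 32, 32, 3]))  # raises: 32x32 smaller than recorded window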

@Chenglin-Yang

Thank you for finding this typo.

If you want to evaluate the model with an input size that differs from the one used during training, you will need to create another model that is built with the new input size and load the weights into it. This is how the current TensorFlow model works.
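
For instance, using the get_model signature quoted above (the model name, input shape, and checkpoint path are placeholders):

# Rebuild the model at the evaluation input size and reload the
# trained weights.
eval_model = get_model(
    name='moat0',                                   # placeholder name
    input_shape=[512, 512, 3],                      # new evaluation size
    pretrained_weights_path='/path/to/checkpoint',  # placeholder path
    strict_loading=False,  # assumption: lets size-dependent variables differ
)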
