Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation

Muzhi Zhu1*,   Yang Liu1*,   Zekai Luo1*,   Chenchen Jing1,   Hao Chen1,   Guangkai Xu1,   Xinlong Wang2,   Chunhua Shen1

1Zhejiang University,   2Beijing Academy of Artificial Intelligence

NeurIPS 2024

🚀 Overview

[Overview figure]

📖 Description

We systematically study four crucial elements of applying the Diffusion Model to Few-shot Semantic Segmentation. For each of these aspects, we propose several reasonable solutions and validate them through comprehensive experiments.

Building upon our observations, we establish the DiffewS framework, which maximally retains the generative framework and effectively utilizes the pre-training prior. Notably, we introduce the first diffusion-based model dedicated to Few-shot Semantic Segmentation, setting the groundwork for a diffusion-based generalist segmentation model.

Paper

🚩 Plan

  • Release the weights.
  • Release the inference code.
  • Release the training code.

👻 Getting Started

Installation

Prepare the environment following GenPercept.

conda create -n diffews python=3.10
conda activate diffews
pip install -r requirements.txt
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
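
After installation, a quick import check can confirm that the environment is usable. The sketch below is not part of this repository: the helper name `check_module` is hypothetical, and which modules to check is an assumption (this README only states explicitly that detectron2 is installed).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): report whether a Python
# module can be imported in the currently active environment.
check_module() {
    local module="$1"
    if python3 -c "import ${module}" >/dev/null 2>&1; then
        echo "ok: ${module}"
    else
        echo "missing: ${module}"
        return 1
    fi
}
```

For example, run `check_module detectron2` inside the activated `diffews` environment; a `missing:` line suggests the corresponding install step did not complete.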

Dataset

Prepare the dataset following Matcher. You only need to download the COCO 2014 dataset.

Training

This script was tested on a single 24 GB RTX 4090.

bash scripts/train_cocofold0_4090_nocrop_lr1_nearest_fold1_7shot_ori_v3.sh

Evaluation

Download the pre-trained model weights from here.

CUDA_VISIBLE_DEVICES=0 bash scripts/eval_coco2014_rthres_1shot_nosample.sh weight/coco_fold0
CUDA_VISIBLE_DEVICES=0 bash scripts/eval_coco2014_rthres_5shot_nosample.sh weight/coco_fold0
CUDA_VISIBLE_DEVICES=0 bash scripts/eval_coco2014_rthres_1shot_nosample_fold0.sh weight/incontext
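
The 1-shot and 5-shot runs above can be wrapped in a small loop. This is a sketch, not a script shipped with the repository: the function name `run_eval` and the `DRY_RUN` variable are hypothetical, while the script paths and the `weight/coco_fold0` directory are taken from the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of this repository) around the 1-shot and
# 5-shot evaluation scripts listed above.
# With DRY_RUN=1 it only prints the commands instead of running them.
run_eval() {
    local weights="$1"      # checkpoint directory, e.g. weight/coco_fold0
    local gpu="${2:-0}"     # GPU index passed to CUDA_VISIBLE_DEVICES
    local shot script
    for shot in 1shot 5shot; do
        script="scripts/eval_coco2014_rthres_${shot}_nosample.sh"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "CUDA_VISIBLE_DEVICES=${gpu} bash ${script} ${weights}"
        else
            # Stop at the first failing evaluation run.
            CUDA_VISIBLE_DEVICES="${gpu}" bash "${script}" "${weights}" || return 1
        fi
    done
}
```

Calling `DRY_RUN=1 run_eval weight/coco_fold0` prints the two commands without launching them, which is a cheap way to check paths before a long run.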

🎫 License

For academic use, this project is licensed under the 2-clause BSD License. For commercial use, please contact Chunhua Shen.

🖊️ Citation

If you find this project useful in your research, please consider citing:

@article{zhu2024unleashing,
  title={Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation},
  author={Zhu, Muzhi and Liu, Yang and Luo, Zekai and Jing, Chenchen and Chen, Hao and Xu, Guangkai and Wang, Xinlong and Shen, Chunhua},
  journal={arXiv preprint arXiv:2410.02369},
  year={2024}
}

Acknowledgement

SegGPT, Matcher, Marigold, GenPercept