Sehyung Kim*, Chanhyeong Yang*, Jihwan Park, Taehoon Song, Hyunwoo J. Kim†.
AAAI 2025
This is the official implementation of the AAAI 2025 paper "Super-class guided Transformer for Zero-Shot Attribute Classification".
git clone https://github.com/mlvlab/SugaFormer.git
cd SugaFormer
conda create -n sugaformer python==3.9
conda activate sugaformer
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt
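As an optional sanity check (not part of the official setup), you can verify that the CUDA-enabled PyTorch build installed correctly:

```bash
# Confirm the installed PyTorch version and that a CUDA device is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
# Expected output is something like: 1.13.1+cu117 True
```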
To run experiments for VAW, you need both the images from the Visual Genome dataset and the annotation files. Follow the steps below:
- Download the Visual Genome images from the link.
- Download the annotation files for VAW experiments from the link.
After downloading the Visual Genome images and annotation files, organize them into the following directory structure:
data/
└── vaw/
    ├── images/
    │   ├── VG_100K/
    │   └── VG_100K_2/
    │
    └── annotations/
        ├── train.json
        ├── test.json
        └── ...
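The commands below are a minimal sketch of how the downloaded archives might be unpacked into that layout. The archive names (`images.zip`, `images2.zip`, `vaw_annotations.zip`) are placeholders and may differ from the files you actually download:

```bash
# Create the expected directory layout (paths taken from the tree above).
mkdir -p data/vaw/images data/vaw/annotations

# Hypothetical archive names -- replace with the files you downloaded.
unzip images.zip  -d data/vaw/images/              # should produce VG_100K/
unzip images2.zip -d data/vaw/images/              # should produce VG_100K_2/
unzip vaw_annotations.zip -d data/vaw/annotations/ # train.json, test.json, ...
```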
Train the model in the fully-supervised setting:
./configs/vaw/train_fs.sh
Train the model in the zero-shot setting:
./configs/vaw/train_zs.sh
Evaluate the model in the fully-supervised setting:
./configs/vaw/eval_fs.sh
Evaluate the model in the zero-shot setting:
./configs/vaw/eval_zs.sh
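For example, a zero-shot training run could be launched as below. Restricting visible GPUs with `CUDA_VISIBLE_DEVICES` is the standard CUDA/PyTorch mechanism and is shown only as a suggestion for selecting a device, not something the scripts require; the `chmod` line is only needed if the scripts are not already executable:

```bash
# Make the script executable if needed, then launch zero-shot training on GPU 0.
chmod +x ./configs/vaw/train_zs.sh
CUDA_VISIBLE_DEVICES=0 ./configs/vaw/train_zs.sh
```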
This repository is built upon the following works:
- DETR (Facebook Research): the codebase we build upon and the foundation of our base model.
- LAVIS (Salesforce): pre-trained vision-language models (BLIP-2) that we use for feature extraction and knowledge transfer.
If you have any questions, please create an issue in this repository or contact us at [email protected].
If you find our work interesting, please consider giving this repository a ⭐ and citing our paper.
@article{kim2025super,
  title={Super-class guided Transformer for Zero-Shot Attribute Classification},
  author={Kim, Sehyung and Yang, Chanhyeong and Park, Jihwan and Song, Taehoon and Kim, Hyunwoo J},
  journal={arXiv preprint arXiv:2501.05728},
  year={2025}
}