- [2024.08.22] Our approach now supports audio attribution on foundation models, using ImageBind as an example! Welcome to try it according to the tutorial!
- [2024.06.16] Our approach now supports interpretation of the medical multimodal model Quilt! Welcome to try it according to the tutorial!
- [2024.06.04] Our approach now supports multi-GPU interpretation processing, please refer to the `./scripts` folder!
- [2024.06.04] Our approach now supports CLIP interpretation! Welcome to try it according to the tutorial!
- [2024.04.22] Our approach now supports LanguageBind interpretation! Welcome to try it according to the tutorial!
- [2024.04.11] Our approach now supports multi-modal models with a ViT backbone (ImageBind, PyTorch only)! Welcome to try it according to the tutorial!
- [2024.01.17] The original code is available now! Welcome to try it according to the tutorial!
- [2024.01.16] The paper has been accepted by ICLR 2024 and selected for oral presentation!
Our method supports both the Keras and PyTorch deep learning frameworks. You can first install PyTorch.
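For example, a typical installation (the exact version and CUDA wheel depend on your platform; see the official PyTorch instructions):

```bash
pip install torch torchvision
```

Then add the remaining dependencies listed below.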
```
opencv-python
opencv-contrib-python
mtutils
tqdm
scipy
scikit-learn
scikit-image
matplotlib==3.7.1
seaborn
xplique>=1.0.3
```
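If you save the list above as a requirements.txt file, the whole set installs in one step:

```bash
pip install -r requirements.txt
```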
Our original code is based on Keras, and the verification on the ViT models relies entirely on PyTorch.
```bash
conda create -n smdl python=3.10
conda activate smdl
python3 -m pip install tensorflow[and-cuda]
pip install git+https://github.com/facebookresearch/segment-anything.git
```
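After installation, a quick sanity check (a minimal sketch) confirms the core dependencies are importable and a GPU is visible:

```python
import torch
from segment_anything import sam_model_registry  # installed by the command above

# Report the PyTorch version and whether CUDA is usable.
print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

# List the registered SAM variants; loading one requires a downloaded checkpoint.
print("SAM variants:", sorted(sam_model_registry.keys()))
```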
Note: Our method will no longer support TensorFlow/Keras and will focus on PyTorch.
Recognition Models (please download the models and place them in the path `ckpt/keras_model`):

| Datasets | Model |
|---|---|
| Celeb-A | keras-ArcFace-R100-Celeb-A.h5 |
| VGG-Face2 | keras-ArcFace-R100-VGGFace2.h5 |
| CUB-200-2011 | cub-resnet101.h5, cub-resnet101-new.h5, cub-efficientnetv2m.h5, cub-mobilenetv2.h5, cub-vgg19.h5 |
Uncertainty Estimation Models (please download the models and place them in the path `ckpt/pytorch_model`):

| Datasets | Model |
|---|---|
| Celeb-A | edl-101-10177.pth |
| VGG-Face2 | edl-101-8631.pth |
| CUB-200-2011 | cub-resnet101-edl.pth |
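For reference, a checkpoint such as cub-resnet101-edl.pth could be loaded along these lines (a minimal sketch; the 200-way head and the state_dict layout are assumptions, so check the repository's model definitions for the exact architecture):

```python
import torch
from torchvision import models

# Assumed architecture: a ResNet-101 backbone whose final layer outputs
# per-class evidence (EDL = evidential deep learning) for the 200 CUB classes.
model = models.resnet101(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, 200)

state = torch.load("ckpt/pytorch_model/cub-resnet101-edl.pth", map_location="cpu")
model.load_state_dict(state, strict=False)  # strict=False because the layout is assumed
model.eval()
```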
Audio classification attribution (on the multimodal foundation model ImageBind):
Medical multimodal model debugging:
If you want to see how to apply this to your own model, please refer to the Jupyter notebooks in `./tutorial/` first.
Note: We first describe how to compute attributions for multimodal models and how to evaluate them.
For multi-GPU processing, please refer to the `./scripts` folder, for example:

```bash
./scripts/clip_multigpu.sh
```
Then, you may get a saved intermediate result in the path `submodular_results/imagenet-clip-vitl/slico-0.0-0.05-1.0-1.0`.
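Each multi-GPU script follows a simple data-parallel pattern: one process per GPU, each on a disjoint shard of the data. A minimal sketch of the idea (the script name and the --shard/--num-shards flags are hypothetical; the real argument handling lives in the scripts under `./scripts`):

```bash
# Hypothetical launcher: one attribution process per GPU, each on its own shard.
for GPU in 0 1 2 3; do
  CUDA_VISIBLE_DEVICES=$GPU \
    python attribution.py --shard $GPU --num-shards 4 &
done
wait  # block until all background processes finish
```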
Evaluate the Insertion and Deletion metrics:

```bash
python -m evals.eval_AUC_faithfulness --explanation-dir submodular_results/imagenet-clip-vitl/slico-0.0-0.05-1.0-1.0
```

You may get the following results:

```
Insertion AUC Score: 0.7550
Deletion AUC Score: 0.0814
```
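For intuition, Insertion reveals pixels in decreasing attribution order and measures the area under the model-confidence curve (higher is better), while Deletion removes them (lower is better). A minimal sketch of the computation, assuming a model_fn that returns the target-class confidence for an image array; the step count and the blank baseline are simplifications relative to the repository's evals implementation:

```python
import numpy as np

def curve_auc(scores):
    # Trapezoidal area under the curve, with x normalized to [0, 1].
    scores = np.asarray(scores, dtype=float)
    return float(((scores[1:] + scores[:-1]) / 2.0).mean())

def insertion_deletion_auc(model_fn, image, saliency, steps=50):
    """image: (H, W, C) array; saliency: (H, W) attribution map."""
    order = np.argsort(saliency.ravel())[::-1]  # most-attributed pixels first
    blank = np.zeros_like(image)
    ins, dele = [], []
    for k in np.linspace(0, order.size, steps, dtype=int):
        keep = np.zeros(order.size, dtype=bool)
        keep[order[:k]] = True
        keep = keep.reshape(saliency.shape)[..., None]
        ins.append(model_fn(np.where(keep, image, blank)))   # reveal top-k pixels
        dele.append(model_fn(np.where(keep, blank, image)))  # delete top-k pixels
    return curve_auc(ins), curve_auc(dele)
```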
- Xplique: a Neural Networks Explainability Toolbox.
- Score-CAM: a third-party implementation with Keras.
- Segment-Anything: a new AI model from Meta AI that can "cut out" any object, in any image, with a single click.
- CLIP: a neural network trained on a variety of (image, text) pairs. It can be instructed in natural language to predict the most relevant text snippet, given an image, without directly optimizing for the task.
- ImageBind: ImageBind learns a joint embedding across six different modalities: images, text, audio, depth, thermal, and IMU data. It enables novel emergent applications 'out-of-the-box', including cross-modal retrieval, composing modalities with arithmetic, cross-modal detection, and generation.
- LanguageBind: LanguageBind is a language-centric multimodal pretraining approach, taking language as the bind across different modalities, because the language modality is well explored and contains rich semantics.
```bibtex
@inproceedings{chen2024less,
  title={Less is More: Fewer Interpretable Region via Submodular Subset Selection},
  author={Chen, Ruoyu and Zhang, Hua and Liang, Siyuan and Li, Jingzhi and Cao, Xiaochun},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024}
}
```