This repository hosts the code for our paper titled "Language Models Meet Anomaly Detection for Better Interpretability and Generalizability", which can also be explored further on our project page.
Our framework processes questions together with the outputs of anomaly detection methods, aiming to provide clinicians with clear, interpretable responses that make anomaly map analyses more intuitive and clinically actionable.
Please cite our paper if you find this repository helpful for your research:
```bibtex
@misc{li2024multiimage,
  title={Multi-Image Visual Question Answering for Unsupervised Anomaly Detection},
  author={Jun Li and Cosmin I. Bercea and Philip Müller and Lina Felsner and Suhwan Kim and Daniel Rueckert and Benedikt Wiestler and Julia A. Schnabel},
  year={2024},
  eprint={2404.07622},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
## Dataset Preparation

- MI-VQA Dataset: Download from this link and save it to `./data/dataset`.
- To preprocess the dataset, run the following (an optional sanity-check sketch follows this list):

  ```bash
  cd data
  python preprocess_dataset.py
  ```
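If you want to sanity-check the preprocessed data before training, the minimal sketch below loads it and prints the fields of the first sample. The file name `train.json` and the list-of-records layout are assumptions for illustration; adapt them to whatever `preprocess_dataset.py` actually writes.

```python
import json
from pathlib import Path

# Hypothetical file name and schema: adjust to the actual output of preprocess_dataset.py.
annotations_file = Path("./data/dataset") / "train.json"

with open(annotations_file) as f:
    samples = json.load(f)

print(f"Loaded {len(samples)} samples")
# Verify the expected fields (e.g., image path, anomaly-map path, question, answer) are present.
print(samples[0].keys() if isinstance(samples, list) else list(samples)[:5])
```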
## Model Training

- Train the model by navigating to the model's directory and executing the provided script:

  ```bash
  cd ./models/VQA
  sh run.sh
  ```

- Training checkpoints will be saved in `./data/ckpts/` (see the checkpoint-loading sketch below).
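Checkpoints written to `./data/ckpts/` can later be reloaded for fine-tuning or inference. The sketch below assumes standard PyTorch checkpoint files; the file name and stored keys are illustrative, so check `run.sh` and the training code for the actual format.

```python
import torch

# Hypothetical file name: replace with an actual checkpoint from ./data/ckpts/.
ckpt_path = "./data/ckpts/vqa_model_latest.pth"

# Load on CPU so the sketch also works on machines without a GPU.
checkpoint = torch.load(ckpt_path, map_location="cpu")

# Inspect what was saved (e.g., model weights, optimizer state, epoch counter).
if isinstance(checkpoint, dict):
    print(list(checkpoint.keys()))
```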
## Result Generation

- Generate results by running the inference script:

  ```bash
  cd ./models/inference
  sh run_vqa_inference.sh
  ```

- Results will be stored in `./evaluation/res/` (see the inspection sketch below).
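To peek at the generated predictions without running the full evaluation, the sketch below simply lists the result files and their sizes; the `*.json` pattern and the list layout are assumptions about how the inference script stores its output.

```python
import json
from pathlib import Path

RES_DIR = Path("./evaluation/res")

# Report each result file the inference script produced (assumed to be JSON lists).
for path in sorted(RES_DIR.glob("*.json")):
    with open(path) as f:
        results = json.load(f)
    print(f"{path.name}: {len(results)} predictions")
```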
## Result Evaluation

- Evaluate the results using the following commands (an optional quick-accuracy sketch follows this list):

  ```bash
  cd evaluation
  python evaluate.py
  ```
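`evaluate.py` computes the metrics reported in the paper. If you only need a quick exact-match accuracy over predicted versus reference answers, a standalone sketch is shown below; the file name and the `prediction`/`answer` fields are assumptions for illustration, and the official numbers should come from `evaluate.py`.

```python
import json

# Hypothetical result file: a list of records with "prediction" and "answer" fields.
with open("./evaluation/res/predictions.json") as f:
    records = json.load(f)

def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivially different strings still match.
    return " ".join(text.lower().split())

correct = sum(normalize(r["prediction"]) == normalize(r["answer"]) for r in records)
print(f"Exact-match accuracy: {correct / len(records):.3f}")
```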
## GUI Interface

- For a graphical interface, use Streamlit (a stripped-down skeleton is sketched below for reference):

  ```bash
  cd ./models/inference
  streamlit run streamlit_gui.py
  ```
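`streamlit_gui.py` wires the trained model into a browser UI. For reference, the stripped-down Streamlit skeleton below shows the general shape of such an interface; the widget layout and the `answer_question` stub are illustrative and not the repository's actual GUI code.

```python
import streamlit as st

st.title("MI-VQA demo")

# Upload the input image and the corresponding anomaly map, then ask a question.
image_file = st.file_uploader("Input image", type=["png", "jpg", "jpeg"])
anomaly_file = st.file_uploader("Anomaly map", type=["png", "jpg", "jpeg"])
question = st.text_input("Question about the anomaly map")

def answer_question(image, anomaly_map, question: str) -> str:
    # Stub: the real GUI runs the trained VQA model here.
    return "Model inference goes here."

if image_file and anomaly_file and question:
    st.image([image_file, anomaly_file], caption=["Image", "Anomaly map"], width=256)
    st.write(answer_question(image_file, anomaly_file, question))
```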