XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.

Omkar Thawakar* , Abdelrahman Shaker* , Sahal Shaji Mullappilly* , Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Jorma Laaksonen, and Fahad Shahbaz Khan.

*Equal Contribution

Mohamed bin Zayed University of Artificial Intelligence, UAE

🚀 News

Aug-04 : Our paper has been accepted at BIONLP-ACL 2024 🔥
Jun-14 : Our technical report is released here. 🔥🔥
May-25 : Our technical report will be released very soon. stay tuned!.
May-19 : Our code, models, and pre-processed report summaries are released.

Online Demo

You can try our demo using the provided examples or by uploading your own X-ray here : Link-1 | Link-2 | Link-3 .

About XrayGPT

XrayGPT aims to stimulate research around automated analysis of chest radiographs based on the given x-ray.
The LLM (Vicuna) is fine-tuned on medical data (100k real conversations between patients and doctors) and ~30k radiology conversations to acquire domain specific and relevant features.
We generate interactive and clean summaries (~217k) from free-text radiology reports of two datasets (MIMIC-CXR and OpenI). These summaries serve to enhance the performance of LLMs through fine-tuning the linear transformation layer on high-quality data. For more details regarding our high-quality summaries, please check Dataset Creation.
We align frozen medical visual encoder (MedClip) with a fune-tuned LLM (Vicuna), using simple linear transformation.

Getting Started

Installation

1. Prepare the code and the environment

Clone the repository and create a anaconda environment

git clone https://github.com/mbzuai-oryx/XrayGPT.git
cd XrayGPT
conda env create -f env.yml
conda activate xraygpt

OR

git clone https://github.com/mbzuai-oryx/XrayGPT.git
cd XrayGPT
conda create -n xraygpt python=3.9
conda activate xraygpt
pip install -r xraygpt_requirements.txt

Setup

1. Prepare the Datasets for training

Refer the dataset_creation for more details.

Download the preprocessed annoatations mimic & openi. Respective image folders contains the images from the dataset.

Following will be the final dataset folder structure:

dataset
├── mimic
|    ├── image
|    |   ├──abea5eb9-b7c32823-3a14c5ca-77868030-69c83139.jpg
|    |   ├──427446c1-881f5cce-85191ce1-91a58ba9-0a57d3f5.jpg
|    |   .....
|    ├──filter_cap.json
├── openi
|    ├── image
|    |   ├──1.jpg
|    |   ├──2.jpg
|    |   .....
|    ├──filter_cap.json
...

3. Prepare the pretrained Vicuna weights

We built XrayGPT on the v1 versoin of Vicuna-7B. We finetuned Vicuna using curated radiology report samples. Download the Vicuna weights from vicuna_weights The final weights would be in a single folder in a structure similar to the following:

vicuna_weights
├── config.json
├── generation_config.json
├── pytorch_model.bin.index.json
├── pytorch_model-00001-of-00003.bin
...

Then, set the path to the vicuna weight in the model config file "xraygpt/configs/models/xraygpt.yaml" at Line 16.

To finetune Vicuna on radiology samples please download our curated radiology and medical_healthcare conversational samples and refer the original Vicuna repo for finetune.Vicuna_Finetune

4. Download the pretrained Minigpt-4 checkpoint

Download the pretrained minigpt-4 checkpoints. ckpt

5. Training of XrayGPT

A. First mimic pretraining stage

In the first pretrained stage, the model is trained using image-text pairs from preprocessed mimic dataset.

To launch the first stage training, run the following command. In our experiments, we use 4 AMD MI250X GPUs.

torchrun --nproc-per-node NUM_GPU train.py --cfg-path train_configs/xraygpt_mimic_pretrain.yaml

2. Second openi finetuning stage

In the second stage, we use a small high quality image-text pair openi dataset preprocessed by us.

Run the following command. In our experiments, we use AMD MI250X GPU.

torchrun --nproc-per-node NUM_GPU train.py --cfg-path train_configs/xraygpt_openi_finetune.yaml

Launching Demo on local machine

Download the pretrained xraygpt checkpoints. link

Add this ckpt in "eval_configs/xraygpt_eval.yaml".

Try gradio demo.py on your local machine with following

python demo.py --cfg-path eval_configs/xraygpt_eval.yaml  --gpu-id 0

Examples

Acknowledgement

MiniGPT-4 Enhancing Vision-language Understanding with Advanced Large Language Models. We built our model on top of MiniGPT-4.
MedCLIP Contrastive Learning from Unpaired Medical Images and Texts. We used medical aware image encoder from MedCLIP.
BLIP2 The model architecture of XrayGPT follows BLIP-2.
Lavis This repository is built upon Lavis!
Vicuna The fantastic language ability of Vicuna is just amazing. And it is open-source!

Citation

If you're using XrayGPT in your research or applications, please cite using this BibTeX:

    @article{Omkar2023XrayGPT,
        title={XrayGPT: Chest Radiographs Summarization using Large Medical Vision-Language Models},
        author={Omkar Thawkar, Abdelrahman Shaker, Sahal Shaji Mullappilly, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Jorma Laaksonen and Fahad Shahbaz Khan},
        journal={arXiv: 2306.07971},
        year={2023}
    }

License

This repository is licensed under CC BY-NC-SA. Please refer to the license terms here.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.

🚀 News

Online Demo

About XrayGPT

Getting Started

Installation

Setup

5. Training of XrayGPT

Launching Demo on local machine

Examples

Acknowledgement

Citation

License

Files

README.md

Latest commit

History

README.md

File metadata and controls

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.

🚀 News

Online Demo

About XrayGPT

Getting Started

Installation

Setup

5. Training of XrayGPT

Launching Demo on local machine

Examples

Acknowledgement

Citation

License