
SSL pre-training and fine-tuning on (64, 256, 256) samples #7

Open
valosekj opened this issue Mar 11, 2024 · 0 comments

This issue summarizes initial experiments with self-supervised learning (SSL) pre-training and fine-tuning using Vision Transformers (ViT). The experiments are based on this MONAI tutorial, and the code is available in the branch jv/vit_unetr_ssl.

The idea is to do self-supervised pre-training on unlabeled images and then do supervised fine-tuning for a specific task, e.g., DCM lesion segmentation.

For simplicity, all experiments so far have been single-channel, using only the T2w contrast.

Pre-training

The pre-training is done on spine-generic multi-subject T2w images using the ViTAutoEnc model via the script vit_unetr_ssl/train.py.

First, two augmented views are created from each original training image (see lines here). Then, a contrastive loss pulls the two augmented views closer together when they are generated from the same patch; otherwise, it maximizes their disagreement.
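Roughly, the view creation follows the MONAI tutorial's transform chain. A hedged sketch (the exact transforms live in train.py; the normalization and the dropout/shuffle parameters below are assumptions carried over from the tutorial):

```python
from monai.transforms import (
    Compose, CopyItemsd, EnsureChannelFirstd, LoadImaged, NormalizeIntensityd,
    OneOf, RandCoarseDropoutd, RandCoarseShuffled, RandSpatialCropSamplesd,
    Spacingd, SpatialPadd,
)

ROI = (64, 256, 256)  # spatial size used for pre-training (see below)

train_transforms = Compose([
    LoadImaged(keys=["image"]),
    EnsureChannelFirstd(keys=["image"]),
    Spacingd(keys=["image"], pixdim=(1.0, 1.0, 1.0)),  # resample to 1 mm iso
    NormalizeIntensityd(keys=["image"], nonzero=True),
    SpatialPadd(keys=["image"], spatial_size=ROI),
    RandSpatialCropSamplesd(keys=["image"], roi_size=ROI, num_samples=2,
                            random_size=False),
    # clean copy kept as the reconstruction target, plus a second view
    CopyItemsd(keys=["image"], times=2, names=["gt_image", "image_2"]),
    # corrupt the two views independently: cutout-style dropout, then shuffling
    OneOf([
        RandCoarseDropoutd(keys=["image"], prob=1.0, holes=6, spatial_size=5,
                           dropout_holes=True, max_spatial_size=32),
        RandCoarseDropoutd(keys=["image"], prob=1.0, holes=6, spatial_size=20,
                           dropout_holes=False, max_spatial_size=64),
    ]),
    RandCoarseShuffled(keys=["image"], prob=0.8, holes=10, spatial_size=8),
    OneOf([
        RandCoarseDropoutd(keys=["image_2"], prob=1.0, holes=6, spatial_size=5,
                           dropout_holes=True, max_spatial_size=32),
        RandCoarseDropoutd(keys=["image_2"], prob=1.0, holes=6, spatial_size=20,
                           dropout_holes=False, max_spatial_size=64),
    ]),
    RandCoarseShuffled(keys=["image_2"], prob=0.8, holes=10, spatial_size=8),
])
```

Corrupting `image` and `image_2` in separate transform calls gives each view its own random draw, which is what makes the contrastive pairing non-trivial.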

So far, I have used a spatial size of (64, 256, 256):

[image: spatial size configuration]
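At this spatial size, the model and loss setup amounts to something like the sketch below, following the tutorial's recipe of an L1 reconstruction loss combined with MONAI's ContrastiveLoss (arguments not stated above are assumptions):

```python
import torch
from monai.losses import ContrastiveLoss
from monai.networks.nets import ViTAutoEnc

# ViT autoencoder on single-channel (64, 256, 256) samples, 16^3 patch embedding
model = ViTAutoEnc(
    in_channels=1,
    img_size=(64, 256, 256),
    patch_size=(16, 16, 16),
    out_channels=1,
)

recon_loss = torch.nn.L1Loss()
contrastive = ContrastiveLoss(temperature=0.05)  # temperature from the tutorial

def ssl_loss(view_1, view_2, gt_image):
    """Loss on two augmented views of the same cropped sample."""
    recon_1, _ = model(view_1)  # ViTAutoEnc returns (reconstruction, hidden states)
    recon_2, _ = model(view_2)
    r_loss = recon_loss(recon_1, gt_image)  # reconstruct the clean image
    c_loss = contrastive(recon_1.flatten(start_dim=1),
                         recon_2.flatten(start_dim=1))
    # the tutorial weights the contrastive term by the reconstruction loss
    return r_loss + c_loss * r_loss
```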

The pre-training (500 epochs, batch size of 2) on 236/29 train/val images (T2w resampled to 1 mm iso) took ~50 hours on a single GPU on romane. I had to set the number of workers to 0 due to `RuntimeError: Pin memory thread exited unexpectedly`; with a higher number of workers, the training would probably be faster.
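For reference, the workaround is a single change in the DataLoader setup (a sketch; `train_ds` stands for the pre-training dataset built from the transforms above):

```python
from monai.data import DataLoader

train_loader = DataLoader(
    train_ds,
    batch_size=2,
    shuffle=True,
    num_workers=0,   # workaround for the pin-memory thread crash
    pin_memory=True,
)
```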

Training & Validation Curves for pre-training SSL

image

Fine-tuning

The fine-tuning is done on dcm-zurich-lesion patients as a supervised task (i.e., providing T2w images and lesion labels) using the script vit_unetr_ssl/finetune.py. The pre-trained weights are loaded into the UNETR model.
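The weight transfer follows the tutorial's pattern of copying the pre-trained ViT backbone into UNETR's `vit` submodule. A sketch (the checkpoint path, its key layout, and the output-channel count are assumptions):

```python
import torch
from monai.networks.nets import UNETR

model = UNETR(
    in_channels=1,
    out_channels=2,          # background + lesion (assumed)
    img_size=(64, 256, 256),
    feature_size=16,
    norm_name="instance",
    res_block=True,
)

# load the pre-trained ViTAutoEnc checkpoint
ckpt = torch.load("pretrained_vit.pt", map_location="cpu")
vit_weights = ckpt.get("state_dict", ckpt)

# keep only the keys that exist in UNETR's ViT backbone and copy them over
model_dict = model.vit.state_dict()
vit_weights = {k: v for k, v in vit_weights.items() if k in model_dict}
model_dict.update(vit_weights)
model.vit.load_state_dict(model_dict)
```

Only the transformer encoder is transferred; the UNETR decoder is trained from scratch on the labeled data.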
