
Try 2D nnUnet #30

Closed
3 tasks done
jcohenadad opened this issue May 2, 2023 · 19 comments
@jcohenadad (Member) commented May 2, 2023

Given the high performance of the nnUnet in general, it would be a good idea to have a benchmark with this architecture.

Todo (update if necessary):

Related to #17

Tagging @naga-karthik @louisfb01 @valosekj for help

@plbenveniste (Collaborator) commented May 10, 2023

Hi!
I am trying to convert the dataset from BIDS to nnUnet. When trying to create the JSON file from the BIDS dataset, I am stuck on something: it seems that the current "create_msd_json_from_bids.py" file doesn't work in the case where there are two possible suffixes. With the zurich-mouse dataset, we have _label-GM_mask and _label-WM_mask.
Have I missed something? Or should I modify the file so that it works with two different suffixes?

@naga-karthik @valosekj

@valosekj (Member) commented May 10, 2023

Hi @plbenveniste!

Indeed, it seems that the current "create_msd_json_from_bids.py" file doesn't work in the case where there are two possible suffixes.

create_msd_json_from_bids.py is the script compatible with nnUNetv1/MONAI. We are currently developing a new convert_bids_to_nnUnetv2.py script compatible with nnUnetv2.

Regarding two suffixes: this is exactly what we are working on in another project. I guess @naga-karthik has an updated version of the conversion script compatible with two suffixes.
UPDATE: sorry, I was wrong; we are working with two labels, not two suffixes. I am not sure whether we have an up-to-date script for two suffixes, then. If you could help with updating the existing data-conversion scripts, we would be grateful!

@plbenveniste (Collaborator) commented May 10, 2023

Hi @valosekj!
Thank you for your answer. In my question I meant two '--label-suffix' arguments, so I think it's equivalent to what you are working on. The goal in my case is to have one label for grey matter and one for white matter.
I'll look into it and let you know if I come up with a working solution.
However, do you have any useful resources I could use to move forward?

@naga-karthik (Member) commented May 10, 2023

Hey @plbenveniste, thanks for your questions! Let us go through them step-by-step:

I am trying to convert the dataset from BIDS to nnUnet.

Unless I missed something, shouldn't you be using convert_bids_to_nnunet.py to (presumably) train an nnUNet model? The file create_msd_json_from_bids.py is specifically for training MONAI models, and hence the image/label pairings have more options.

it seems that the current "create_msd_json_from_bids.py" file doesn't work in the case where there are two possible suffixes.

Yes, you're right. In most cases, it assumes that there are two or more "contrasts" but a single (common) label.

With the zurich-mouse dataset, we have : _label-GM_mask and _label-WM_mask.

A quick workaround would be to slightly modify the group_by_contrasts if-condition here to have two labels instead of two contrasts (i.e. label_0000 for GM_mask and label_0001 for WM_mask). This assumes that your dataset does not have multiple sessions (if it does, you'd have to modify the if-condition for group_by_sessions instead).
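The two-suffix pairing could look roughly like this. This is a minimal sketch, assuming a BIDS-like layout and a hypothetical helper name; the real group_by_contrasts logic in the conversion script differs, and the T1w contrast and derivatives layout are assumptions:

```python
from pathlib import Path

def pair_image_with_labels(bids_root, suffixes=("label-GM_mask", "label-WM_mask")):
    """Pair each anatomical image with its GM and WM label masks.

    Hypothetical helper: one image maps to label_0000 (GM) and
    label_0001 (WM), mirroring the suggested if-condition tweak.
    """
    pairs = []
    for img in sorted(Path(bids_root).rglob("*_T1w.nii.gz")):  # assumed contrast
        labels = {}
        for i, suffix in enumerate(suffixes):
            # label files assumed to sit next to the image (layout is an assumption)
            lbl = Path(str(img).replace("_T1w.nii.gz", f"_T1w_{suffix}.nii.gz"))
            if lbl.exists():
                labels[f"label_{i:04d}"] = lbl
        if len(labels) == len(suffixes):  # keep only fully annotated images
            pairs.append({"image": img, **labels})
    return pairs
```
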

However, do you have any useful resources which I could use to move forward?

If the above does not work, you could check out the dataset conversion instructions in nnUNet's repo. Even MONAI's tutorials have some examples of dataset conversion.

Hope this helps!

@plbenveniste (Collaborator)

Indeed, I was mistaken. I thought I had to use create_msd_json_from_bids.py to create the dict_split.json file. I ended up creating it by hand.
Thank you for your help.
I am now trying to figure out how to use two labels with convert_bids_to_nnunet.py.

@jcohenadad (Member, Author)

@plbenveniste Can you please move your developments under a branch in this repository? It will be easier to manage/crossref discussions/issues/code/results if development stays in one repo. Thanks!

@plbenveniste (Collaborator)

Done.
I called the branch plb/nnunet

@jcohenadad (Member, Author)

Done. I called the branch plb/nnunet

Great! Could you also open a PR so we can discuss developments / code design in the PR directly?

@plbenveniste (Collaborator) commented May 31, 2023

Update :

  • removed the notebook (which was useless)
  • updated the requirements.txt file
  • modified the conversion script to take every image, using those that are labeled for training and those that aren't for inference (it doesn't require a split_dict file anymore)
  • removed the split_dict file
  • uploaded everything on romane (waiting for a GPU to be free)
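The no-more-split_dict behaviour could be sketched like this: a toy version assuming flat images/labels folders with matching filenames, not the actual conversion script:

```python
from pathlib import Path

def split_by_annotation(images_dir, labels_dir):
    """Assign images that have a matching label file to training and
    the rest to inference, so no manual split_dict is needed.

    Assumes image and label files share the same filename (layout is
    an assumption for this sketch).
    """
    label_names = {p.name for p in Path(labels_dir).glob("*.nii.gz")}
    train, infer = [], []
    for img in sorted(Path(images_dir).glob("*.nii.gz")):
        (train if img.name in label_names else infer).append(img.name)
    return train, infer
```
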

To do:

  • update the README file
  • run a training with 5 folds (similar to Julien's 80/20 split)
  • run inference
  • interpret results

@plbenveniste (Collaborator) commented Jun 1, 2023

[screenshot: nnU-Net training progress]

Training crashed at epoch 74. There is no information in training_log_1.txt regarding the reason for the crash.
Observation: it seems that the model is converging towards a low pseudo-dice value (~0.2).

UPDATE: it crashed because I wasn't using screen

@plbenveniste (Collaborator)

[image: training curves]

By the 100th epoch the model had converged (pseudo-dice ≈ 0.2). Afterwards, the Dice score stops changing while the training loss keeps decreasing: the model is overfitting.

Let's run an inference to see what the model does.

@plbenveniste (Collaborator)

[image: example of a segmentation mask]

[images: predictions of the model]

The model's predictions look similar to the segmentation masks we gave it. It segments well, but only on a few slices: that's because annotations were only performed on a few slices.

Possible solutions:

  • try the 2D nnU-Net configuration to compare with the 3D configuration
  • extract the labeled slices and retrain the model on those slices only; the downside is that it takes us back to a 2D model.
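The slice-extraction idea can be sketched with plain NumPy, assuming the image and mask are arrays of the same shape with slices along the last axis:

```python
import numpy as np

def extract_labeled_slices(image, mask, axis=2):
    """Return only the slices along `axis` where the mask contains at
    least one annotated voxel, plus their original slice indices."""
    other_axes = tuple(a for a in range(mask.ndim) if a != axis)
    # boolean per slice: does this slice contain any labeled voxel?
    idx = np.flatnonzero(np.any(mask > 0, axis=other_axes))
    return np.take(image, idx, axis=axis), np.take(mask, idx, axis=axis), idx
```

Keeping the returned indices makes it possible to map 2D predictions back to their original positions in the volume afterwards.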

@jcohenadad (Member, Author)

@plbenveniste The 3D kernel cannot 'properly' be applied in this scenario, because as you noted, slices are only sparsely annotated. To train a 3D kernel, you need ground truth with adjacent slices annotated. So yes, 2D nnUnet would be the way to go.

@plbenveniste (Collaborator) commented Jun 5, 2023

Training and testing with the 2D nnU-Net:
The nnU-Net reaches around 0.8 in terms of pseudo-dice.
But every prediction outputs nothing. This makes sense, since more than 80% of the slices in the training dataset are not annotated.
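The annotation-sparsity figure above is easy to check with a quick script; a sketch assuming the masks are loaded as NumPy arrays with slices along the last axis:

```python
import numpy as np

def annotated_slice_fraction(masks, axis=2):
    """Fraction of slices (along `axis`, pooled over all masks) that
    contain at least one labeled voxel."""
    labeled = total = 0
    for m in masks:
        other = tuple(a for a in range(m.ndim) if a != axis)
        per_slice = np.any(m > 0, axis=other)  # one bool per slice
        labeled += int(per_slice.sum())
        total += per_slice.size
    return labeled / total
```

A value well below 0.2 would confirm that most training slices carry no annotation.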

Possible solutions:

  • extract every labeled slice in order to retrain the 2D model on those slices only?
  • use multiple 3D nnU-Net predictions to get slices annotated at different levels of the spinal cord and aggregate them afterwards?

@jcohenadad what do you advise?

@plbenveniste (Collaborator)

I tried multiple runs of a 3D nnU-Net to get different slices annotated, then concatenating the results to end up with a full annotation of the spinal cord. It didn't work: the model always outputs annotations for the same slices.

jcohenadad changed the title from "Try 3D nnUnet" to "Try 2D nnUnet" on Jun 7, 2023
@jcohenadad (Member, Author)

use multiple 3d nnU-Net prediction to get slices annotated at different levels of the spinal cord and aggregate them afterwards ?

No, this is not the right approach. A 3D UNet only makes sense if there is spatial autocorrelation across the 3rd dimension, which is not the case if you concatenate slices that are physically far apart.

I would just do 2D nnUnet, not 3D. In fact, I noticed that the title of the issue said "3D nnUnet", but that was a mistake; I meant 2D nnUnet.

@plbenveniste (Collaborator) commented Jun 8, 2023

UPDATE:
I built a script to transform the dataset and extract every labelled slice. It outputs the following files: a NIfTI file for the slice of the MR image and two NIfTI files for the masks (one for WM and one for GM). The dataset folders are built with the same organisation, so that the convert_bids_to_nnunet.py file still works.
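A per-slice export like the one described might look as follows. The filenames are hypothetical, and actually writing each array to NIfTI (e.g. with nibabel) is left out of the sketch:

```python
import numpy as np

def slices_to_export(image, gm_mask, wm_mask, stem):
    """For every slice (last axis) labeled in either mask, return
    {filename: (H, W, 1) array} entries for the image and both masks.

    Hypothetical naming scheme; the actual BIDS entities may differ.
    """
    labeled = np.flatnonzero(np.any(gm_mask > 0, axis=(0, 1))
                             | np.any(wm_mask > 0, axis=(0, 1)))
    out = {}
    for z in labeled:
        for name, vol in ((f"{stem}_slice-{z:03d}.nii.gz", image),
                          (f"{stem}_slice-{z:03d}_label-GM_mask.nii.gz", gm_mask),
                          (f"{stem}_slice-{z:03d}_label-WM_mask.nii.gz", wm_mask)):
            out[name] = vol[:, :, z:z + 1]  # keep a singleton 3rd dim
    return out
```

Keeping the singleton third dimension preserves the (200,200,1) shape mentioned in the to-do list below.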

TO DO:

  • train the model on this dataset
  • test if it works better (I hope the 2D nnU-Net model can accept NIfTI files of different sizes, because it will be trained on (200,200,1) but tested on (200,200,500))
  • if the above doesn't work: convert each NIfTI file of shape (200,200,500) into 500 NIfTI files of shape (200,200,1) (or the other way around)

@plbenveniste (Collaborator)

UPDATE:

TO DO:

  • Try inference (and see if the dimension difference between training files and testing files causes a problem)

@plbenveniste (Collaborator)

Training results:
[image: training curves]

Pseudo-dice on validation set: 0.89

Testing results:
[image: test predictions]
Fully labelled NIfTI files: how can we determine whether they are correct?
Some bugs among the labels:
[image: examples of label bugs]
