Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create testing dataset #2

Merged
merged 19 commits into from
Aug 7, 2020
Merged

Create testing dataset #2

merged 19 commits into from
Aug 7, 2020

Conversation

lrouhier
Copy link
Contributor

@lrouhier lrouhier commented Jul 24, 2020

This PR aims at uploading testing data used in the ivadomed testing suite

Images and files:

  • sub-XXX --> Bids folder with multiple contrast (T2w,T2star and T1w)
  • derivatives/sub-XXX --> derivatives folder with seg-manual labels for all contrasts, labels-disc-manual labels for T2w, lesion-manual label for all contrasts (dummy seg),
  • bounding_box.json --> dictionnary to test certain function from the bounding_box.py file
  • dataset_description.json --> needed file to descript dataset
  • participants.csv --> table with subject names and potential metadata
  • config file for train/test command --> We need to separate the 2 (it's json files so it does not take much space)
  • temporary_results.csv --> Used to test ivadomed/scrip/compare_models.py

Subject used is sub-unf01 from https://github.com/ivadomed/data_example_spinegeneric
derivatives/labels/sub-unf01_lesion-manual.nii.gz are dummy seg. The goal is to use seg_manual as a ROI and lesion-manual as target.

@jcohenadad jcohenadad changed the title Create testing data for distriploy release Create testing dataset Jul 24, 2020
@jcohenadad
Copy link
Member

@jcohenadad
Copy link
Member

@lrouhier to squeeze size, can you:

  • resample to 1mm in-plane
  • convert to float32

@jcohenadad
Copy link
Member

In the readme: indicate the changes to the dataset (e.g., sub-selection of subject, processing, etc.)

@lrouhier lrouhier force-pushed the lr/distriploy_release branch from 167f44f to 646f106 Compare July 27, 2020 14:41
@lrouhier
Copy link
Contributor Author

lrouhier commented Jul 27, 2020

@lrouhier to squeeze size, can you:

  • resample to 1mm in-plane
  • convert to float32

done and previous commit has been squashed to reduc repo size. Repo size is now 33 mb.

@lrouhier
Copy link
Contributor Author

lrouhier commented Aug 4, 2020

@jcohenadad I think it is ready

README.md Outdated
@@ -1,2 +1,10 @@
# data-testing
Light-weighted data for testing/tutorial

# Content
* sub-unf01: Bids folder with multiple contrast (T2w,T2star and T1w). Original image from [ivadomed spine generic example repo](https://github.com/ivadomed/data_example_spinegeneric/releases/tag/r20200907) resampled to 1mm isotropic and converted to float32 to reduce repo size. Subject was randomly selected among the available subjects.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it says "sub-test001" in the bounding_box_dict.json file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed in 9bce745

Copy link
Member

@jcohenadad jcohenadad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

images have 80 slices, which seems unnecessarily large. Even when testing 3D kernels, this is probably an overkill. Why not cropping to 10 slices? we will reduce the size from ~30MB to ~4MB.
that's a lot of BW we can save given that those testing data are being downloaded for each CI run (ie. multiple times a day)

@lrouhier
Copy link
Contributor Author

lrouhier commented Aug 4, 2020

images have 80 slices, which seems unnecessarily large. Even when testing 3D kernels, this is probably an overkill. Why not cropping to 10 slices? we will reduce the size from ~30MB to ~4MB.
that's a lot of BW we can save given that those testing data are being downloaded for each CI run (ie. multiple times a day)

Yes will do and squash the commit that created the file to have a lighter repo. I'll add the transformation to the README

@lrouhier lrouhier force-pushed the lr/distriploy_release branch 2 times, most recently from 9f66146 to 450ae5a Compare August 4, 2020 21:25
@lrouhier lrouhier force-pushed the lr/distriploy_release branch from 1f34e4b to 0aa6b80 Compare August 4, 2020 21:37
@lrouhier
Copy link
Contributor Author

lrouhier commented Aug 4, 2020

We have a repo of 8 mb do you want me to crop the image further ?

@lrouhier lrouhier force-pushed the lr/distriploy_release branch from f8834d7 to 9bce745 Compare August 5, 2020 02:15
@lrouhier
Copy link
Contributor Author

lrouhier commented Aug 5, 2020

I forced push again. we need to have at least 16 slice per image. This is neede because the filter on the 3D unet has an issue with only ten slices (due to stride and convolution if I am correct)

@jcohenadad
Copy link
Member

I forced push again. we need to have at least 16 slice per image. This is neede because the filter on the 3D unet has an issue with only ten slices (due to stride and convolution if I am correct)

i don't see the images anymore in the latest commit 9bce745.
while you're at it, maybe you can also reduce the dim along X and Y. Be "greedy".

@lrouhier
Copy link
Contributor Author

lrouhier commented Aug 5, 2020

Yes the images are not there because i was doing some tests to be sure that it would work this time (and wanted to avoid having to clean the repo a fourth time).

@jcohenadad
Copy link
Member

Yes the images are not there because i was doing some tests to be sure that it would work this time (and wanted to avoid having to clean the repo a fourth time).

i think you don't need to bother with that, as we will do a squash merge. I believe this gets rid of the history (in case you re-uploaded binaries multiple times)-- @kouzu can you confirm?

@lrouhier
Copy link
Contributor Author

lrouhier commented Aug 5, 2020

I just added the images that works with the tests. I cropped all dimension (details of the bounding box in readme). Repo is 4.5 mb I think (zip download is 2.5 mb).

@jcohenadad
Copy link
Member

on 2a55d35 json files are missing:

julien-macbook:~/code/ivadomed-data-testing/sub-unf01/anat $ ll
total 5016
drwxr-xr-x  5 julien  staff      160  5 Aug 14:11 .
drwxr-xr-x  3 julien  staff       96  5 Aug 14:11 ..
-rw-r--r--  1 julien  staff  1067161  5 Aug 14:11 sub-unf01_T1w.nii.gz
-rw-r--r--  1 julien  staff  1062495  5 Aug 14:11 sub-unf01_T2star.nii.gz
-rw-r--r--  1 julien  staff   433424  5 Aug 14:11 sub-unf01_T2w.nii.gz

@jcohenadad
Copy link
Member

we should probably set up a procedure for generating json files on images that were processed (because the matrix size does not correspond anymore), but this is out of the scope of this PR.

@lrouhier are the discrepancies between the json and the images (e.g. dim, matrix size, resolution, etc.) a problem when running ivadomed?

@lrouhier
Copy link
Contributor Author

lrouhier commented Aug 5, 2020

No, for now the tests are running without them (because we used the contrast metadata which is in the name). However, They could be useful if we decide to create a test for FILM.

@jcohenadad
Copy link
Member

No, for now the tests are running without them

@jcohenadad jcohenadad closed this Aug 5, 2020
@jcohenadad
Copy link
Member

oopsi!

@jcohenadad jcohenadad reopened this Aug 5, 2020
@jcohenadad
Copy link
Member

No, for now the tests are running without them

so, does it mean that ivadomed's loader doesn't care if there is no json file in the BIDS dataset?

@lrouhier
Copy link
Contributor Author

lrouhier commented Aug 5, 2020

so, does it mean that ivadomed's loader doesn't care if there is no json file in the BIDS dataset?

I think so if it doesn't need it (from here). In the next lines an error is raised if the asked field is missing.

@lrouhier
Copy link
Contributor Author

lrouhier commented Aug 7, 2020

we should probably set up a procedure for generating json files on images that were processed (because the matrix size does not correspond anymore)

Not sure which field I need to change in the json for matrix size, resolution, dim ect .. @jcohenadad

@jcohenadad
Copy link
Member

we should probably set up a procedure for generating json files on images that were processed (because the matrix size does not correspond anymore)

Not sure which field I need to change in the json for matrix size, resolution, dim ect .. @jcohenadad

will be addressed in #3

@jcohenadad
Copy link
Member

I have some concerns with the decision to create random labels not representing the image content ivadomed/ivadomed#392, but we don,t need to fix it here

@jcohenadad jcohenadad merged commit 5e0c60a into master Aug 7, 2020
@jcohenadad jcohenadad deleted the lr/distriploy_release branch August 7, 2020 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants