Use active learning based on uncertainty to augment dataset #35

Open
wants to merge 7 commits into
base: main

jcohenadad
Member

This PR introduces the following algorithm:

  • Run predictions using the models published in https://github.com/ivadomed/model_seg_mouse-sc_wm-gm_t1/releases/tag/v0.2
  • Compute the standard deviation (STD) across predictions
  • Average the STD per slice (ignoring zeros and NaNs)
  • If the average STD of a slice is above a threshold, remove the prediction for that slice
  • Save predictions under derivatives/labels_active_learning
  • Train a new model with the uncertainty-filtered predicted labels
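
The filtering steps above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function name, the `(n_models, X, Y, Z)` array layout, the choice of the last axis as the slice axis, and the interpretation of "remove" as zeroing the slice in a majority-vote label are all assumptions.

```python
import numpy as np

def filter_slices_by_uncertainty(predictions, threshold, slice_axis=-1):
    """Hypothetical helper. predictions: array of shape (n_models, X, Y, Z)."""
    preds = np.asarray(predictions, dtype=float)

    # Voxel-wise STD across the ensemble (axis 0 indexes the models).
    std_map = preds.std(axis=0)

    # Ignore zeros and NaNs by marking them as NaN before averaging.
    masked = np.where((std_map == 0) | np.isnan(std_map), np.nan, std_map)

    # One row per slice, then a NaN-aware mean (all-ignored slices get 0).
    n_slices = masked.shape[slice_axis]
    per_slice = np.moveaxis(masked, slice_axis, 0).reshape(n_slices, -1)
    valid = ~np.isnan(per_slice)
    counts = valid.sum(axis=1)
    sums = np.where(valid, per_slice, 0.0).sum(axis=1)
    slice_std = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

    # Assumed ensemble label: voxel kept if the mean prediction exceeds 0.5.
    label = (preds.mean(axis=0) > 0.5).astype(np.uint8)

    # "Remove" the prediction on uncertain slices (here: set the slice to 0).
    keep = slice_std <= threshold
    np.moveaxis(label, slice_axis, 0)[~keep] = 0  # view, so label is modified
    return label, slice_std, keep
```

Instead of zeroing a slice, the `keep` mask could also be used to exclude those slices from training entirely; which of the two is appropriate depends on how the training pipeline treats empty labels.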

Fixes #31


jcohenadad commented May 13, 2023

Output prediction_std for sub-mouse1/anat/sub-mouse1_chunk-1_T1w, computed with version 534f16a:

image

prediction_std.csv

Based on this plot, slices 416 to 440 have the highest STD. Let's look at the prediction STD for these slices:

Yup! That looks about right:
Screen Shot 2023-05-13 at 6 03 50 PM


jcohenadad commented May 13, 2023

A few things to consider:

  • Would an STD threshold be generalizable across the whole dataset?
  • How reliable is STD computed on very few pixels?


jcohenadad commented May 13, 2023

How reliable is STD computed on very few pixels?

In this example, STD is 0.000305 (relatively low):

Screen Shot 2023-05-13 at 6 08 22 PM

But no pixel is picked up by majority voting!

Screen Shot 2023-05-13 at 6 08 26 PM

In this other example, STD is high (0.03803):

image

But the segmentation is pretty good!

image

So, I need to find another way to filter based on STD (or another metric...)
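
To make the two observations above concrete, here is a toy reproduction on synthetic data. It is only a sketch under assumptions: soft predictions in [0, 1], five models, majority voting after thresholding each model at 0.5, and a hypothetical `slice_summary` helper. It is not derived from the actual mouse data.

```python
import numpy as np

def slice_summary(preds):
    """Hypothetical helper. preds: (n_models, H, W) soft predictions for one slice.

    Returns (mean of the non-zero STD values, number of majority-vote pixels).
    """
    preds = np.asarray(preds, dtype=float)
    std_map = preds.std(axis=0)
    nonzero = std_map[std_map > 0]
    mean_std = float(nonzero.mean()) if nonzero.size else 0.0
    votes = (preds > 0.5).sum(axis=0)
    majority = votes > preds.shape[0] / 2
    return mean_std, int(majority.sum())

n_models = 5

# Case A: every model outputs only faint values on a few pixels.
# The STD is tiny, yet majority voting selects nothing.
faint = np.zeros((n_models, 8, 8))
faint[:, 3:5, 3:5] = np.linspace(0.0, 0.004, n_models)[:, None, None]

# Case B: all models agree on a 4x4 core, two of them add a 1-pixel rim.
# The core has zero STD and is ignored, so the slice average is
# dominated by the disagreement at the edge.
good = np.zeros((n_models, 8, 8))
good[:, 2:6, 2:6] = 1.0
good[:2, 1:7, 1:7] = 1.0

print("faint:", slice_summary(faint))
print("good: ", slice_summary(good))
```

In this toy setting the empty slice gets a much lower score than the well-segmented one, which is one possible reading of why a single STD threshold misbehaves when zeros are excluded from the average.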


jcohenadad commented May 13, 2023

A few ideas to address #35 (comment):

  • A "good looking" segmentation has high uncertainty only at its edges. So maybe I could ignore STD computed at the edge of the segmentation?
  • Instead of computing STD, compute the total overlap (a bit like a Dice score, but across more than 2 labels).
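
Both ideas could be prototyped per slice along these lines. This is a sketch, not a proposal for the final metric: the helper names are hypothetical, the edge is removed with a simple 4-neighbour erosion of the majority-vote mask, and "total overlap" is taken here as N·|A₁ ∩ … ∩ A_N| / Σ|A_i|, one of several possible generalizations, which reduces to the Dice score when N = 2.

```python
import numpy as np

def erode(mask):
    """4-neighbour binary erosion of a 2D mask (outside the image = background)."""
    m = np.pad(np.asarray(mask, dtype=bool), 1, constant_values=False)
    return (m[1:-1, 1:-1] & m[:-2, 1:-1] & m[2:, 1:-1]
            & m[1:-1, :-2] & m[1:-1, 2:])

def interior_std(preds, threshold=0.5):
    """Mean STD inside the eroded majority-vote mask, i.e. ignoring the edge."""
    preds = np.asarray(preds, dtype=float)
    binary = preds > threshold
    majority = binary.sum(axis=0) > preds.shape[0] / 2
    interior = erode(majority)
    if not interior.any():
        return float("nan")  # nothing left after erosion: metric undefined
    return float(preds.std(axis=0)[interior].mean())

def total_overlap(preds, threshold=0.5):
    """N * |intersection| / sum of sizes; equals Dice for two predictions."""
    binary = np.asarray(preds) > threshold
    sizes = binary.reshape(binary.shape[0], -1).sum(axis=1).sum()
    if sizes == 0:
        return float("nan")  # all predictions empty: metric undefined
    intersection = binary.all(axis=0).sum()
    return float(binary.shape[0] * intersection / sizes)
```

Returning NaN for empty slices keeps them distinguishable from slices with genuinely low uncertainty, so they can be handled by a separate rule (for example, dropping slices where majority voting selects no pixel).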

Development

Successfully merging this pull request may close these issues.

Use active learning based on uncertainty of ensemble predictions
1 participant