Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add all Kim's DevCCFv1 atlases #441

Merged
merged 18 commits into from
Jan 17, 2025
Merged

Conversation

carlocastoldi
Copy link
Contributor

Description

With this draft PR i ask you your opinion and help to first:

  • allow the script to download the resources from the official OneDrive page
  • have a general review of the script's code quality before adapting it to generate the other combination of atlases/references
  • choose which atlases are worth generating and whether they should update any previous atlas

Lastly, I have tested this script on two setups: on one computer without parallelization and the other with. The run with parallelization didn't manage the memory properly and run OOM on linux, effectively making the generation of the atlas much slower. Does BrainGlobe offer any help in the management of the parallelization or should I do it myself?

P.S. if needed, I can provide the resulting atlas from this script

What is this PR

  • Bug fix
  • Addition of a new feature
  • Other

Why is this PR needed?
To perfection the support to Kim's developing mouse brain atlases

What does this PR do?
It adds a script for generating Kim's DevCCFv1 mouse atlas. For now, it generates only the P14 altas with LSFM@20µm as reference, ignoring the other ages (E11.5, E13.5, E15.5, E18.5, P4 and P56) and references (they are all MRI@50 µm)

References

#425

How has this PR been tested?

The resulting atlas from the script was used within ABBA

Is this a breaking change?

No.

Does this PR require an update to the documentation?

It does, but it is not included yet in this PR

Checklist:

  • The code has been tested locally
  • Tests have been added to cover all new functionality (unit & integration)
  • The documentation has been updated to reflect any changes
  • The code has been formatted with pre-commit

@alessandrofelder
Copy link
Member

Hey @carlocastoldi

Thanks a lot!

Does BrainGlobe offer any help in the management of the parallelization or should I do it myself?

We are aware that past parallelisation efforts are not robust #428, so if you would like to contribute a better way, even in just for one atlas, that might help us address #429 - so that would be great

choose which atlases are worth generating and whether they should update any previous atlas

I am happy to give my thoughts on code quality and how to download from OneDrive but can I first check (Pardon my naivety!): what is the difference here to the existing Kim Developmental atlas that is already packaged? Could this be an extension of that existing script?

@adamltyson
Copy link
Member

what is the difference here to the existing Kim Developmental atlas that is already packaged? Could this be an extension of that existing script?

I think that is an earlier version of the P56 atlas.

@carlocastoldi ideally we would add all these atlases, but the number of ages x reference images is very large. In principle though, there's no reason why we can't add them all. Each would have to be a seperate atlas (at least until we built Atlas API V2). Would you be willing to write a script that can add all of these atlases?

@carlocastoldi
Copy link
Contributor Author

We are aware that past parallelisation efforts are not robust #428, so if you would like to contribute a better way, even in just for one atlas, that might help us address #429 - so that would be great

Oh okay, i see that! It's something I may perhaps look into, but for now it can't be my priority...

can I first check (Pardon my naivety!): what is the difference here to the existing Kim Developmental atlas that is already packaged? Could this be an extension of that existing script?

Oh I tried to understand if the existing "developmental" atlas (it's not really developmental because it's only for P56, when mice are already adult. It's the same age at which Allen developed CCFv3) had any publication associated with it, but as far as i understood, that's only a v0.0.1 of the work they have recently published (v1). I am not 100% sure about it, tho, as i can't find any description/publication about that dataset.
Surely, however, it is not compatible with their v1 as they differ in resolution, coordinate system, ontology and reference.
Rather, these atlases are closer to those of ADMBA, however they would also make them outdated (ADMBA's are very low resolution and hard to work with due to their undefined region boundaries).

Lastly they offer two different resolutions: one at 20µm for LSFM references and one at 31.5µm/40µm/50µm (depending on the age of the atlas) for MRI references and some stereotaxic coordinates (for post-natal atlases).
For now, in this PR, i would focus on adding just LFSM@20µm atlases.

I am happy to give my thoughts on code quality and how to download from OneDrive

Thank you a lot!

@adamltyson
Copy link
Member

For now, in this PR, i would focus on adding just LFSM@20µm atlases.

That seems sensible, thanks!

@carlocastoldi
Copy link
Contributor Author

@carlocastoldi Would you be willing to write a script that can add all of these atlases?

Oh i missed this message, sorry!
But yes, sure. I'm down for it. However at the moment the script assumes that the resources are already downloaded and it does not retrieve them from official channels as they are on OneDrive

@carlocastoldi
Copy link
Contributor Author

only one question remains: would you be okay if this PR also deleted the scripts for generating ADMBA and old KimP56 atlases?

They are all superseded by this new version for the developing mouse. Leaving them all would otherwise overcrowd the list of supported atlases

@adamltyson
Copy link
Member

For now, I think we should keep them. Although the new atlases may be better, there are still researchers using the old ones for ongoing projects.

We also want to ensure that BrainGlobe acts as an archive in case the original files cannot be found for older atlases.

Longer term, we will find a solution for making the set of available atlases easier to navigate (e.g. atlas groups).

@carlocastoldi
Copy link
Contributor Author

Sure
Thank you for the explanation!

Copy link
Member

@alessandrofelder alessandrofelder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for this, @carlocastoldi - and thank you and @adamltyson for your clarifications.

The code looks largely good on a first pass (I haven't run it, but it's promising you could use it in ABBA 🙌 ), made two small comments.

Do either of you know how official the OneDrive folder is? It looks like in the publication they point to this Figshare repo. Using this repo instead would make it easy to use pooch to download and cache them neatly (and it's likely a more robust place to get data from long-term)?

@carlocastoldi
Copy link
Contributor Author

Do either of you know how official the OneDrive folder is?

Is the official download page from their website, apparently.

One that could get updated and be changed when new versions come out, I imagine.

@alessandrofelder
Copy link
Member

alessandrofelder commented Dec 20, 2024

Hmmm, I see.

I'd still prefer going with the figshare data if possible, because

  • it has a license
  • it is versioned by a DOI
  • it is linked to from the publication we include in the metadata
  • (it's easier to download programmatically... haven't found a way to easily generate a sensible download link from OneDrive)

I'm not sure how standardised new versions published on OneDrive would be?
Would they make a new folder somewhere? Would they overwrite some of the existing files?
I may be missing something.

What do you think @carlocastoldi ?

@carlocastoldi
Copy link
Contributor Author

Sure, i see all the upsides of using figshare. I guess I was worried that we wouldn't notice any update to the published atlas unless we actively search for them, but that's a minor issue!

I'll work on this PR by using pooch on Figshare. Thank you all!

@carlocastoldi carlocastoldi marked this pull request as ready for review December 20, 2024 19:53
@carlocastoldi
Copy link
Contributor Author

With this last commit, i think this PR is ready for review!

@carlocastoldi carlocastoldi changed the title Add DevCCFv1 P14 LSFM@20µm resolution Add DevCCFv1 LSFM@20µm resolution Dec 20, 2024
@adamltyson
Copy link
Member

Thanks @carlocastoldi. The BrainGlobe team is checked out for the holidays now, so we'll take a look in the new year.

@carlocastoldi
Copy link
Contributor Author

I figured!
I just wanted to have my part done before holidays. No hurry!

Happy holidays! ^^

Copy link
Member

@alessandrofelder alessandrofelder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this @carlocastoldi - I've tested the script locally on E11 and it's looking great 🎉 :
image

I can also confirm that the generated atlas (at least at E11) passes all our validation functions successfully.

I have a major comment that should be straightforward to handle (🤞 - happy to help)

The global variables seem to imply that the script handles all MODALITIES, but LSFM is hard-coded later on and we only loop over timepoints. We should document clearly on how the script is intended to be run.

Maybe for this PR, just LSFM is enough?

If we wanted the other modalities (in this PR or as a separate issue for now) Maybe easiest for use is to
- run LSFM by default
- have an argument to specify the modality on the command line: like python kim_devccf_mouse.py --modality MRI-dwi
- replace hard-coded LSFM with modality (and assert modality in MODALITIES)

This would allow running atlas generation in parallel for each modality on our HPC (as separate SLURM jobs)

I have also suggested removing any obsolete (AFAICT) commented-out code.

Let me know your thoughts :)

Comment on lines 32 to 51
RESOLUTIONS = (20, 50)
MODALITIES = (
"LSFM", # Light Sheet Fluorescence Microscopy
"MRI-adc", # MRI Apparent Diffusion Coefficient
"MRI-dwi", # MRI Difusion Weighted Imaging
"MRI-fa", # MRI Fractional Anisotropy
"MRI-MTR", # MRI Magnetization Transfer Ratio
"MRI-T2", # MRI T2-weighted
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are never used AFAICT - what is the intention for them? IIUC they are possible other modalities and resolutions that we don't loop over in the current version... maybe we want them in a later version?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason why we can't just add them? We have all the modalities for the P56 atlas already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to add them, no reason why not - my question was more about what the intention of this PR is, because the current code puts the modalities into a list variable that is never used?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I meant - can we adapt this PR to include all modalities rather than just deleting this commented out list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existence of this list was intended as ~metadata of Kim's DevCCFv1 reference images, rather than for actual generation of the atlases.

The intent of this PR was to add LSFM only because:

  • its the most accurate and at highest resolution (20um vs 50 um of MRI images)
  • generating 7×6=42 atlases seemed too overwhelming for an user navigating the atlases through show_atlases() (et simil.) output. The atlases are 7, the only thing changing is the reference image. This, in my experience, creates a sort of "bias" and confusions.

That said I think i could do adapt the script to generate those atlases as well, if you confirm me you think it's appropriate! However, i would discourage from officially adding them to BrainGlobe, until #141 is not complete.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's appropriate. Ideally we would wait until #141 is complete, but unfortunately, if we treat #141 as a blocker, there's a lot of things we won't get done!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as you wish. I modified the script to allow other modalities as well.

Now the last thing remaining is the naming convention of the atlases. What do you prefer?

  1. kim_devccf_mouse_<MODALITY>_<AGE>
  2. kim_devccf_mouse_<AGE>_<MODALITY>

The 2nd option is probably better for navigation

@carlocastoldi
Copy link
Contributor Author

Thank you for the comments!

I still have some questions, tho. There are parameters I was a unsure of and for which i chose the default values, when generating the meshes. Could you please help me making and informed decision:

  • closing_n_iters
  • decimate_fraction
  • smooth. The resulting meshes retain the voxels shape, as we can also see from @alessandrofelder screenshot. Should we smooth them out? How do you evaluate the quality of the meshes?

Also: is it correct to use these same parameters for every age (and sizes)?
Lastly: should I add the possibility to generate the atlases for MRI@50µm reference images, could I still use the annotation volume at 20µm instead?

Thank you in advance!

@carlocastoldi
Copy link
Contributor Author

carlocastoldi commented Jan 9, 2025

with the last commit i set the atlas naming to kim_devccf_mouse_<AGE>_<MODALITY>.

Last thing to discuss is whether this PR should include the change of the naming to overlap previous kim atlases in brainglobe (i.e. be it kim_dev_mouse_<AGE>_<MODALITY>).
If not, i think this script is ready to be merged, on my side!

@alessandrofelder
Copy link
Member

Last thing to discuss is whether this PR should include the change of the naming to overlap previous kim atlases in brainglobe (i.e. be it kim_dev_mouse__).

I would be in favour of this last change, to make it clearer to BrainGlobe-only users that this is a newer version of the same atlas? @adamltyson may wish to chime in

@adamltyson
Copy link
Member

Yeah, if no other reason than it's a slightly shorter name.

@carlocastoldi
Copy link
Contributor Author

right.

Does that mean that I should set VERSION=3, since kim_dev_mouse is currently on version 2? And what about the old script? Should I delete it?

@adamltyson
Copy link
Member

Does that mean that I should set VERSION=3, since kim_dev_mouse is currently on version 2?

I think so

And what about the old script? Should I delete it?

I think leave it for now, we don't (yet) have a process for this.

@carlocastoldi carlocastoldi changed the title Add DevCCFv1 LSFM@20µm resolution Add all Kim's DevCCFv1 atlases Jan 10, 2025
@carlocastoldi
Copy link
Contributor Author

I found the issue I had when generating the meshes for embryonic mri atlases.

It was pooch having an unintuitive interface, in my opinion, extracting files from archives in a different order from the one you asked.
You might have seen it working on your OS cause it depends on how the filesystem lists files. Nonetheless it wouldn't have worked on a Linux system such as Slurm, i am afraid.

With this, i feel like this PR is finally concluded! 🎉

@alessandrofelder
Copy link
Member

alessandrofelder commented Jan 14, 2025

Thanks again for this @carlocastoldi - I am happy to say that I was able to regenerate all atlases locally and they pass our validation tests 🎉

Upon visual inspection though, it look like all MRI reference images are converted to tiff badly. An example (E11.5, MRI-adc):

After loading the reference image with load_nii, we get:

# on MRI-adc data at E11.5
(Pdb) reference.dtype
dtype('float64')
(Pdb) np.max(reference)
0.024380000308156013
(Pdb) np.min(reference)
0.0

reference gets naively converted to uint16 during wrapup, so all values become 0.
So I think we want to rescale the MRI data to the range [0, 2**16-1], like we do elsewhere. (we can rescale the LSFM data too for simplicity in the script, as it won't make a difference)

Does that make sense?

PS this seems to have been done in the LSFM data already, at least I checked for E11.5 programmatically (and visually they all look fine!)

# on LSFM data at E11.5
(Pdb) import numpy as np
(Pdb) np.max(reference)
65535.0
(Pdb) np.min(reference)
0.0
(Pdb) reference.dtype
dtype('float64')

I can take this on if you don't have time @carlocastoldi - just let me know :)

@carlocastoldi
Copy link
Contributor Author

thanks a lot @alessandrofelder for identifying the problem and pointing me to the correct solution

@carlocastoldi
Copy link
Contributor Author

I added a --cached-meshes option to the script that may speed up the generation of the atlases using the same meshes (i.e. for computing meshes only once). It requires the corresponding atlas of the cached meshes to be completed, however.

Example
first run:

python brainglobe_atlasapi/atlas_generation/atlas_scripts/kim_devccf_mouse.py -t E11.5 -m MRI-adc

and only after it finished, you can run any of:

python brainglobe_atlasapi/atlas_generation/atlas_scripts/kim_devccf_mouse.py -t E11.5 -c MRI-adc -m MRI-dwi
python brainglobe_atlasapi/atlas_generation/atlas_scripts/kim_devccf_mouse.py -t E11.5 -c MRI-adc -m MRI-fa
python brainglobe_atlasapi/atlas_generation/atlas_scripts/kim_devccf_mouse.py -t E11.5 -c MRI-adc -m MRI-MTR
python brainglobe_atlasapi/atlas_generation/atlas_scripts/kim_devccf_mouse.py -t E11.5 -c MRI-adc -m MRI-T2

@alessandrofelder
Copy link
Member

Thanks @carlocastoldi - I will have time to try to regenerate the MRI atlases tomorrow.

@alessandrofelder
Copy link
Member

Atlases generated locally, I will put them on GIN today - thanks again @carlocastoldi, much appreciated!

@alessandrofelder alessandrofelder merged commit 79c5878 into brainglobe:main Jan 17, 2025
12 of 14 checks passed
@carlocastoldi
Copy link
Contributor Author

I am sorry to come back to this, but does this work for you?

import brainglobe_atlasapi
brainglobe_atlasapi.BrainGlobeAtlas("kim_dev_mouse_p04_lsfm_20um")
brainglobe_atlasapi.BrainGlobeAtlas("kim_dev_mouse_P04_LSFM_20um") # also fails to download & extract for me

I then run that same command in debug mode and, before running tar = tarfile.open(destination_path), i swapped the atlas.tar.gz file with the manually downloaded archive from your Gin repo.
BrainGlobe had finally extracted the archive and created ~/.brainglobe/kim_dev_mouse_P04_LSFM_20um_v1.3.

However, brainglobe_atlasapi.show_atlases() still did not show the atlas. I had to rename all-lowercase the folder of the atlas (i.e. ~/.brainglobe/kim_dev_mouse_p04_lsfm_20um_v1.3). Only then BrainGlobe could use the ~downloaded atlas.

@adamltyson
Copy link
Member

Thanks @carlocastoldi, due to this bug, atlas names, and the corresponding entries in the list here must all be lowercase.

@alessandrofelder - might be better to update the script, then re-run all the atlas generation, rather try to rename everything (and the many errors that could occur).

@PolarBean
Copy link
Contributor

PolarBean commented Jan 23, 2025

hey I'm just trying to figure out how to update the README to include this update. The issue i'm having is up until now I could just list the API name of each atlas (ie: kim_dev_mouse_P04_LSFM_20um), but with this update there are now too many combinations to list all. instead of kim_dev_mouse_P04_LSFM_20um should I say kim_dev_mouse_{age}_{modality}_{resolution}um? thanks!

@PolarBean
Copy link
Contributor

maybe better to discuss over at #444

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants