Feat(csparc2star): Add support for multi volume groups in CS v4.5+ #1

Open
wants to merge 1 commit into
base: master

valentinp

@valentinp valentinp commented May 1, 2024

To test, run: pyem/csparc2star.py J9_00006_particles.cs J9_passthrough_particles_all_classes.cs particles.star, where J9 is a classification job. This should work with either v4.4 or v4.5 outputs.

@valentinp valentinp requested review from alipunjani and mcleanm and removed request for alipunjani May 1, 2024 21:08

valentinp commented May 6, 2024

Daniel DM draft:
(@alipunjani to review after v4.5 release)

Hey Daniel (DanielAsarnow),

Hope you're doing well, and thank you for your ongoing efforts with pyem! In CryoSPARC v4.5 we've modified the way we handle 3D multi-class alignment metadata to (hopefully) make working with classification outputs easier, especially when dealing with many classes. Namely, we've introduced a new dataset column type called alignments3D_multi that permits dynamically-sized arrays, obviating the need for K different prefixes for K classes. For example, alignments3D_multi/class_posterior is now an N x K matrix that replaces the K separate columns alignments_class_k/class_posterior for k = 0 to K-1. As of v4.5, all jobs that output classes in 3D use alignments3D_multi exclusively.
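As a rough illustration of the layout change described above, here is a toy sketch in Python using NumPy structured arrays. It assumes N is the number of particles, and it models the multi-class column as a fixed-width subarray field of shape (K,), which is a simplification for illustration only; the actual on-disk representation of alignments3D_multi in a .cs file may differ. The values are random placeholders, not real CryoSPARC output.

```python
import numpy as np

N, K = 4, 3  # assumed: N particles, K classes (toy sizes)
rng = np.random.default_rng(0)
post = rng.random((N, K)).astype("f4")
post /= post.sum(axis=1, keepdims=True)  # make each row a posterior over classes

# Pre-v4.5 style: one scalar column per class, i.e. K separate prefixes.
old_dtype = [(f"alignments_class_{k}/class_posterior", "f4") for k in range(K)]
old = np.zeros(N, dtype=old_dtype)
for k in range(K):
    old[f"alignments_class_{k}/class_posterior"] = post[:, k]

# v4.5 style (simplified): a single column holding a length-K array per particle.
new = np.zeros(N, dtype=[("alignments3D_multi/class_posterior", "f4", (K,))])
new["alignments3D_multi/class_posterior"] = post

# Stacking the K old columns side by side reproduces the N x K matrix.
stacked = np.stack(
    [old[f"alignments_class_{k}/class_posterior"] for k in range(K)], axis=1
)

# With the matrix form, the most probable class per particle is one argmax.
best_class = new["alignments3D_multi/class_posterior"].argmax(axis=1)
```

With the matrix form, per-particle operations such as picking the most probable class no longer need a loop over K column prefixes.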

We have some documentation about this change in a new accompanying utility job [LINK].

To ensure csparc2star.py works with this new convention, we put together a (relatively short) PR for pyem here: [LINK]. We've done some preliminary testing on our end to make sure everything works as expected, but please let me know if you have any questions / concerns on your end! Hopefully this is a straightforward merge!

Cheers,
Valentin

@mcleanm mcleanm left a comment

This PR looks good to me. A couple of notes:

  • csparc2star appears to only read alignment-type fields (2D or 3D) from the first cs file in the list, i.e., not from any of the passthroughs
  • with this edit, the output star file's model parameters are read from one of the following slots, in order of descending preference:
    • alignments3D
    • alignments3D_multi
    • alignments_class_X or alignments_class3D_X (whichever is in the first cs file)
  • Thus:
    • For jobs that output both alignments3D and alignments3D_multi (right now: Regroup 3D and Align 3D), csparc2star will write model parameters from alignments3D. This is fine because these jobs keep alignments3D in sync with alignments3D_multi.
    • For jobs that only output alignments3D_multi (e.g. hetero refine, multi-class ab-initio, 3D classification, etc.), csparc2star will write model parameters from alignments3D_multi.
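
The preference order in the notes above can be sketched as a small Python helper. This is not the code in the PR or in pyem: pick_alignment_slot is a hypothetical name, and the field names in the usage lines (such as alignments3D/pose and blob/path) are illustrative. The sketch only assumes that dataset columns are named prefix/field and that the selection looks at the first cs file's columns.

```python
import re

def pick_alignment_slot(field_names):
    """Return the slot prefix to read model parameters from, or None.

    Hypothetical helper mirroring the order described in the review:
    alignments3D, then alignments3D_multi, then a per-class prefix.
    """
    # Collect the distinct prefixes of "prefix/field" column names.
    prefixes = {name.split("/", 1)[0] for name in field_names if "/" in name}
    for preferred in ("alignments3D", "alignments3D_multi"):
        if preferred in prefixes:
            return preferred
    # Fall back to the legacy per-class prefixes, e.g. alignments_class_0.
    per_class = sorted(
        p for p in prefixes if re.fullmatch(r"alignments_class(3D)?_\d+", p)
    )
    return per_class[0] if per_class else None

both = pick_alignment_slot(["alignments3D/pose", "alignments3D_multi/pose"])
multi_only = pick_alignment_slot(["alignments3D_multi/class_posterior", "blob/path"])
legacy = pick_alignment_slot(["alignments_class_0/pose", "alignments_class_1/pose"])
```

Here both resolves to alignments3D, multi_only to alignments3D_multi, and legacy to the first per-class prefix, matching the two cases described in the notes.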
