Per element metadata #745

Open
ivirshup opened this issue Mar 23, 2022 · 1 comment

It would be nice to have metadata per element in the anndata object. That is, X and each layer could have metadata attached.

  • All of our on disk formats support this
  • R generally supports this
  • NumPy arrays and sparse arrays do NOT support this

Use cases

  • Storing provenance information on the objects
  • Storing colors with a categorical labelling
  • Storing parameters of methods
  • Storing information on the semantic meaning of an array, e.g. “this is a coordinate” or “this is a probabilistic labelling”

Possible solutions

Metadata is stored separately from the elements

The anndata object, rather than the elements themselves, could carry the metadata. This could be implemented as a kind of “shadow” of the object which contains only metadata, stored at the same paths as the elements it describes. This is not so different from storing metadata in uns, but it has “path based” access, which could make it easier to keep up to date.
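As a rough illustration of the “shadow” idea, here is a minimal sketch of a path-keyed metadata store. The class name `MetadataShadow`, its methods, and the path strings are all hypothetical and not part of anndata; the point is only that metadata follows a path, so renaming or deleting an element has an obvious counterpart on the metadata side.

```python
from __future__ import annotations

from typing import Any


class MetadataShadow:
    """Hypothetical side store mapping element paths to metadata dicts."""

    def __init__(self) -> None:
        self._store: dict[str, dict[str, Any]] = {}

    def set(self, path: str, **metadata: Any) -> None:
        # Merge new keys into whatever is already recorded for this path.
        self._store.setdefault(path, {}).update(metadata)

    def get(self, path: str) -> dict[str, Any]:
        # Return a copy so callers cannot mutate the store by accident.
        return dict(self._store.get(path, {}))

    def rename(self, old: str, new: str) -> None:
        # Keep metadata in sync when an element moves to a new path.
        if old in self._store:
            self._store[new] = self._store.pop(old)

    def drop(self, path: str) -> None:
        # Forget metadata when the element itself is deleted.
        self._store.pop(path, None)


shadow = MetadataShadow()
shadow.set("layers/counts", semantic="raw counts")
shadow.set("obsm/X_umap", semantic="coordinate")
shadow.rename("layers/counts", "layers/raw")
```

The container would have to call `rename` and `drop` whenever an element is moved or removed, which is the bookkeeping cost of this approach.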

Every element carries around its own metadata

This works for some objects. pandas DataFrames and xarray objects have metadata slots on them. However, NumPy arrays and basic Python objects do not.
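To make the difference concrete, here is a small sketch comparing a pandas DataFrame with a plain NumPy array. It relies on `DataFrame.attrs`, which pandas documents as experimental; the color values and the `"semantic"` key are made up for illustration.

```python
import numpy as np
import pandas as pd

# pandas exposes a dict-like metadata slot on the object itself.
df = pd.DataFrame({"cluster": ["a", "b", "a"]})
df.attrs["colors"] = {"a": "#1f77b4", "b": "#ff7f0e"}

# A plain ndarray has no such slot and rejects arbitrary attributes.
arr = np.zeros((3, 2))
try:
    arr.attrs = {"semantic": "coordinate"}
    numpy_accepts_attrs = True
except AttributeError:
    numpy_accepts_attrs = False
```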

Two implementation strategies immediately come to mind here:

Use xarray for everything

We could try to use xarray for everything, but there are some incompatibilities we’d have to work out first (see #744).

Custom classes 😰

We could also make custom subclasses for all the element types we want to support (that don’t already cover this). But subclassing pydata objects is scary.
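For a sense of what this would involve, here is a minimal sketch of an `ndarray` subclass that carries an `attrs` dict, following the subclassing pattern from the NumPy documentation. The class name `AnnotatedArray` and the `attrs` attribute are hypothetical. It also shows one reason this is scary: the metadata is silently dropped as soon as something converts the array back to a base `ndarray`.

```python
import numpy as np


class AnnotatedArray(np.ndarray):
    """Hypothetical ndarray subclass with an `attrs` metadata dict."""

    def __new__(cls, input_array, attrs=None):
        # View the input as our subclass, then attach the metadata.
        obj = np.asarray(input_array).view(cls)
        obj.attrs = dict(attrs or {})
        return obj

    def __array_finalize__(self, obj):
        # Called for views and slices; copy metadata from the parent if any.
        if obj is None:
            return
        self.attrs = dict(getattr(obj, "attrs", {}))


coords = AnnotatedArray(np.zeros((4, 2)), attrs={"semantic": "coordinate"})
sliced = coords[:2]          # slicing keeps the subclass and its attrs
plain = np.asarray(coords)   # conversion to a base ndarray loses attrs
```

Every element type without a metadata slot (sparse matrices, for instance) would need its own version of this, each with its own edge cases.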

Alternatives

  • Metadata could all be stored in uns, basically how we do it right now.
  • Metadata could all be stored externally through a logging system
    • Other frameworks would have to adopt this
    • Ideally metadata could be transferred in the same object as the data. We'd need a way of encoding these logs.
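One possible encoding for such logs, sketched under the assumption that each entry records which element was touched, by which method, and with which parameters. The field names, method names, and parameter values are hypothetical; JSON is used only because it round-trips through a plain string that could be stored alongside the data.

```python
import json

# Hypothetical log entries: one record per operation on an element.
log = [
    {"element": "X", "method": "normalize", "params": {"target_sum": 10000}},
    {"element": "layers/scaled", "method": "scale", "params": {"max_value": 10}},
]

# Encode to a single string that can travel with the data, then decode it.
encoded = json.dumps(log)
decoded = json.loads(encoded)

# Recover the history of one element by filtering on its path.
for_x = [entry for entry in decoded if entry["element"] == "X"]
```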
@ivirshup
Member Author

Some concrete use cases include:
