use of Dimensions with Molecular entities - Proposal #20

Open
proccaserra opened this issue Jul 24, 2018 · 1 comment


@proccaserra
Contributor

@jonathancrabtree, @agbeltran following up on our call: I am logging this as an issue, although it is really a point of discussion about how to use DATS.Dimension with DATS.MolecularEntity in the context of the MGI dataset. I propose to:

  1. reserve DATS.Dimension for variables that can be measured about the molecular entity (e.g. 'molecular weight', 'pKa', 'concentration'), as per the definition of DATS.Dimension (A feature of an entity, i.e. an individual measurable property (both quantitative or qualitative) of the entity being observed.)

  2. extend the DATS.MolecularEntity properties to include 'location' information, akin to DATS.Material.spatialCoverage.

rationale:

  • allow the reporting of genomic location in a reference coordinate system.
    This would possibly clarify how to deal with 'Chromosome', 'start_coordinate', and 'end_coordinate'.
  • allow the specification of an anatomical location or a cellular component localization.
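
To make the two points concrete, here is a rough sketch of what a record could look like if dimensions hold only measurable properties and location moves to its own property. It is only an illustration: the `location` key, its `genomic` sub-key, the helper name `make_molecular_entity`, and all values are hypothetical, and the layout is simplified rather than checked against the DATS schema.

```python
import json


def make_molecular_entity(name, dimensions, location=None):
    """Build a simplified, DATS-like MolecularEntity record as a plain dict."""
    entity = {
        "name": name,
        # Point 1: dimensions list only measurable properties of the entity.
        "dimensions": [{"name": d} for d in dimensions],
    }
    if location is not None:
        # Point 2: hypothetical 'location' property (not an existing DATS field).
        entity["location"] = location
    return entity


# Placeholder values, for illustration only.
gene = make_molecular_entity(
    name="example gene",
    dimensions=["molecular weight"],
    location={
        "genomic": {
            "Chromosome": "1",
            "start_coordinate": 1000,
            "end_coordinate": 2000,
        }
    },
)

print(json.dumps(gene, indent=2))
```

With this shape, 'Chromosome', 'start_coordinate' and 'end_coordinate' no longer need to be modelled as dimensions or pushed into extraProperties.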

This proposal stems from the discussion we had about consistent use of object properties (either DATS.Dimension or DATS.extraProperties).

Such modification/extension to DATS.MolecularEntity would help clarify / refine the ER diagram you pushed the other day.

@cmungall
cmungall commented Aug 3, 2018

Given that the alliance data isn't typically about individual organisms or measurements, a measurement-based model doesn't really make sense (some of the MODs do have data at this level of granularity, but this is not typical).

However, I can see the value of a generic data model in which entities have arbitrary properties ("Dimensions"). This seems quite powerful for modeling arbitrary outputs of analysis programs, which are typically tabular or vector oriented. But it may be limiting for a heavily normalized (in the Codd sense) knowledge resource like a model organism database. And I don't really understand where things like extraProperties come in. When is a property extra?

I am therefore leaning towards option 2, though I'm not quite sure how best to implement it. Only genomic entities will have chromosome base pair range localizations, whereas actual gene products and molecular complexes have subcellular localizations. But moving in this direction seems like a good start.
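
One possible way to capture that constraint is a small lookup from entity kind to the kinds of location it may carry. This is a sketch under assumptions: the kind names (`genomic_entity`, `gene_product`, `molecular_complex`), the location labels, and the function `location_allowed` are all made up for illustration and are not part of DATS.

```python
# Hypothetical mapping: which location kinds each entity kind may carry.
ALLOWED_LOCATIONS = {
    "genomic_entity": {"genomic"},         # chromosome base pair ranges
    "gene_product": {"subcellular"},       # cellular component localization
    "molecular_complex": {"subcellular"},
}


def location_allowed(entity_kind, location_kind):
    """Return True if this kind of entity may carry this kind of location."""
    return location_kind in ALLOWED_LOCATIONS.get(entity_kind, set())
```

For example, `location_allowed("gene_product", "genomic")` is False under this mapping, while `location_allowed("genomic_entity", "genomic")` is True.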
