Requesting feedback on mapping files #24
Comments
I looked at AGR_FB_Mapping (I assume the other MOD files are identical since the format is identical across the Alliance files). I previously expressed some of my concerns here: dcppc/crosscut-metadata#21. I'm not sure the DATS Dimension model is appropriate for representing the Alliance data. Even when representing basic gene information, the mapping is lossy. For example, multiple fields with distinct semantics (displayName, prefix, localid) are all mapped to relatedIdentifiers. On row 22, type is mapped to title. Is this a mistake? The ortholog mapping also looks lossy: it's not clear how a homology query could be performed on the transformed data. It may be that I am misunderstanding the mappings file. Is there an example DATS JSON file? That would really help. |
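To make the lossiness concern concrete, here is a minimal Python sketch. The field names (displayName, prefix, localid, relatedIdentifiers) come from the comment above. The record values, the `sourceField` key, and the three helper functions are hypothetical and are not taken from the actual mapping files or from the DATS schema.

```python
# Hypothetical source record: field names are from the comment above,
# the values are invented purely for illustration.
source = {"displayName": "example-gene", "prefix": "MGI", "localid": "0000001"}


def lossy_map(record):
    """Collapse every field into one untyped list of identifiers."""
    return {"relatedIdentifiers": [{"identifier": v} for v in record.values()]}


def typed_map(record):
    """Keep the originating field name next to each value.

    'sourceField' is a made-up key used only to show the idea.
    """
    return {
        "relatedIdentifiers": [
            {"identifier": v, "sourceField": k} for k, v in record.items()
        ]
    }


def invert(mapped):
    """Try to rebuild the source record from the mapped form."""
    return {
        entry["sourceField"]: entry["identifier"]
        for entry in mapped["relatedIdentifiers"]
        if "sourceField" in entry
    }


# The typed form round-trips; the collapsed form cannot be inverted.
print(invert(typed_map(source)) == source)
print(invert(lossy_map(source)))
```

Once the three values sit in the same list with no label, nothing in the output says which one was the display name and which was the local ID, so the original record cannot be rebuilt.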
Thanks! Can you provide URLs to get to the JSON? |
Sure, there you go, MGI sample DATS JSON |
Thanks again! It looks like things are not being mapped at the correct level. For example,
Zfp58 is the gene symbol, not the name of the taxon (which should be Mus musculus). Similarly, the RIKEN identifiers are at the level of the gene, not the taxon. For the identifiers, there are things like
I suggest having a single canonical identifier and using a CURIE such as MGI:99205, which would facilitate JSON-LD to RDF conversion using a canonical context file. I'm looking for the homology information; it seems to be embedded inside Material objects:
I don't really know what a Material is here, or what the list of values is intended to represent. Overall I'm still not quite sure I grok the datamodel. Each gene is modeled as a I'm trying to map this all onto my own mental map of biology and not having much luck. |
Hi, Would it be possible to use the compact identifiers [1] form for all the identifiers? This means you will also be able to resolve the identifiers using identifiers.org or n2t (KC2, team Sodium work). Cheers, |
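As a rough illustration of the compact identifier suggestion, the Python sketch below splits a CURIE such as MGI:99205 (the example used earlier in this thread) into prefix and local ID and builds a resolver URL. It assumes the identifiers.org pattern of base URL followed by `prefix:accession`. The function names are hypothetical, and the exact namespace spelling a resolver expects for a given resource should be checked against its registry.

```python
# Assumed resolver base; n2t or another resolver could be passed in instead.
IDENTIFIERS_ORG = "https://identifiers.org/"


def parse_curie(curie):
    """Split 'PREFIX:LOCALID' into (prefix, local_id).

    Only the first colon is treated as the separator, so local IDs that
    themselves contain colons are preserved.
    """
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a compact identifier: {curie!r}")
    return prefix, local_id


def to_resolver_url(curie, base=IDENTIFIERS_ORG):
    """Build a resolver URL for a compact identifier (validated first)."""
    prefix, local_id = parse_curie(curie)
    return f"{base}{prefix}:{local_id}"


print(parse_curie("MGI:99205"))
print(to_resolver_url("MGI:99205"))
```

Storing one canonical CURIE per entity and deriving URLs like this on demand keeps the JSON compact and avoids committing to a single resolver.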
@cmungall thanks for looking into this. We will review and get back to you soon. |
I'm sorry, I'm just seeing this issue. What's the best way for me/TOPMed to get more context about this? I'm not sure what we're reviewing for, or who the best person would be. |
@bheavner Please let the Oxygen team (Anu) know if you would like to get additional information about the mapping process. |
The Oxygen team has generated mapping files for mapping the metadata to the crosscut metadata model, also known as the DATS model. We would like to request feedback and start a discussion on improving the mappings. The PR for the mapping files is dcppc/crosscut-metadata#22.