Improve annotation of cross-references #103
Comments
CC @yongqunh
The [Cell Line Ontology (CLO)](https://bioregistry.io/registry/clo) is a detailed resource; however, it does not follow the standard OBO modeling pattern for cross-references, which uses either a predicate from [SKOS](https://bioregistry.io/skos) or `oboInOwl:hasDbXref` to point to a single CURIE encoded as a string. Instead, it uses `rdfs:seeAlso` with a combination of non-standard CURIEs that are either comma- or semicolon-delimited.

Depends on:
- biopragmatics/bioregistry#896
- CLO-ontology/CLO#103
@cthoyt That's great! We would welcome your contribution. Sorry for the delayed reply; I just came back from a two-week trip. Dr. Jie Zheng recently joined our group, and I would suggest having her involved as well. Do you have time for a meeting on this?
Hi @yongqunh and @zhengj2007, I am also just back from vacation and at a project meeting this week. Next week I am relatively free and on East Coast time. It would be great to plan a video conference. For me, around 11 AM is best, but I'm flexible. My email is [email protected] if you would rather coordinate on that channel (or on Slack).
I am available around 11:00 EST Monday, Tuesday, and Thursday next week. It would be great to discuss it with you. @cthoyt
See related discussion on what annotation property should be used to indicate the mapped/equivalent terms |
I found that CLO makes cross-references using the `rdfs:seeAlso` predicate instead of something a bit more fit-for-purpose like `oboInOwl:hasDbXref` or `skos:exactMatch`.

I also found that these `rdfs:seeAlso` annotations point to strings that potentially contain multiple CURIEs, with varying degrees of heterogeneity in how they're written.

Can you give a bit of insight into why the cross-references were encoded this way?
I wrote a script that attempts to parse and standardize them using the Bioregistry. I posted the output as an SSSOM file in this gist. Would you be interested in me contributing these back to CLO in a more standardized way?
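For illustration, here is a rough sketch of the kind of parsing involved. It is not the script mentioned above: the function names and the `PREFIX_SYNONYMS` table are hypothetical, and the hand-written table stands in for the prefix lookup that the Bioregistry would provide. It only assumes that a `rdfs:seeAlso` value is a string of `prefix:identifier` entries separated by commas or semicolons.

```python
import re

# Hypothetical stand-in for a Bioregistry lookup: maps lowercase prefix
# spellings to the prefix we want to emit.
PREFIX_SYNONYMS = {
    "atcc": "atcc",
    "cellosaurus": "cellosaurus",
    "cvcl": "cellosaurus",
}


def split_xrefs(value):
    """Split a comma- or semicolon-delimited string into trimmed, non-empty parts."""
    return [part.strip() for part in re.split(r"[;,]", value) if part.strip()]


def standardize(raw):
    """Return a normalized CURIE string, or None if the entry cannot be interpreted."""
    prefix, sep, identifier = raw.partition(":")
    if not sep:
        return None  # no colon, so this is not a CURIE-like entry
    norm_prefix = PREFIX_SYNONYMS.get(prefix.strip().lower())
    identifier = identifier.strip()
    if norm_prefix is None or not identifier:
        return None  # unknown prefix or empty identifier
    return f"{norm_prefix}:{identifier}"


def parse_see_also(value):
    """Parse one rdfs:seeAlso string into a list of normalized CURIEs."""
    parts = (standardize(part) for part in split_xrefs(value))
    return [curie for curie in parts if curie is not None]


if __name__ == "__main__":
    # Made-up input that mixes delimiters, spacing, and capitalization.
    print(parse_see_also("ATCC: CCL-2; Cellosaurus:CVCL_0030, unparseable text"))
```

Entries that cannot be interpreted are dropped here for brevity; in practice it would probably be better to collect them for manual review rather than discard them silently.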