Decouple annotations/associations from the main release obo/owl files #13

cmungall · 2024-02-01T02:53:25Z

pyobo currently includes annotations (in the sense of GO annotations, not OWL annotations) modeled as relationships (i.e., S subClassOf R some O).

An example of this is ec.obo:

[Term]
id: eccode:1.1.1.1
name: alcohol dehydrogenase
is_a: eccode:1.1.1 ! With NAD(+) or NADP(+) as acceptor
relationship: RO:0002327 GO:0004022 ! enables alcohol dehydrogenase (NAD+) activity
relationship: RO:0002351 uniprot:A0A0H2URT2 ! has member ADHE_STRPN
relationship: RO:0002351 uniprot:A0A0H2ZM56 ! has member ADHE_STRP2
[many rows deleted]

This has a number of practical and semantic disadvantages:

It bloats the size (ec.obo is 14x bigger with relationships)
Danger of ontological errors (this is a real risk: the composed products will simply not work in OWL environments unless everything is modeled just so)
Lack of modularity / harder to recompose into application-specific products (e.g. what if I want EC + just human proteins?)
The product becomes stale sooner
Lack of separation of concerns
For associations it's important to have evidence and provenance. While this can be done in ontology formats using axiom annotations, it can get bulky and awkward. A TSV is often simpler and better
Directionality issues (are links to EC distributed with uniprot? links to uniprot distributed with EC? both?)
Shoreline issues (ec.obo includes all swissprot annotations, but not, say, an arguably more useful set like reference proteomes for core species. Why?)
It's broadly understood that distributing annotations and "contingent knowledge" in the ontology and in models like OWL is not a good strategy, see e.g. https://doi.org/10.1016/j.yjbinx.2019.100002. See also slides 51 onwards

Instead decouple the associations / annotations / contingent knowledge. Use TSVs without OWL semantics and all its pitfalls. KGX is a good choice. Some associations are better modeled as SSSOM. By all means distribute these as .obo/.owl as well, and by all means distribute merged products too. The key is to focus on the "conceptual coat hanger" as Rector calls it, and allow people to hang their coats as they please.

In practical terms something like this:

This is less work for pyobo/obo-db-ingest overall. Sometimes you can simply say "we are only providing the coat rack today, we may get to the associations later"
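To make the proposed decoupling concrete, here is a minimal sketch in plain Python that splits the ec.obo stanza quoted above into a slim stanza (the "coat hanger") and a separate table of associations. It is not how pyobo implements anything: the function names `split_stanza` and `edges_to_tsv` are hypothetical, and the three-column `subject`/`predicate`/`object` layout is only a simplified, KGX-like assumption (a real KGX edge file carries more columns, and evidence or provenance would be added as further columns).

```python
import csv
import io

# The stanza from the issue, without the "[many rows deleted]" placeholder.
STANZA = """\
[Term]
id: eccode:1.1.1.1
name: alcohol dehydrogenase
is_a: eccode:1.1.1 ! With NAD(+) or NADP(+) as acceptor
relationship: RO:0002327 GO:0004022 ! enables alcohol dehydrogenase (NAD+) activity
relationship: RO:0002351 uniprot:A0A0H2URT2 ! has member ADHE_STRPN
relationship: RO:0002351 uniprot:A0A0H2ZM56 ! has member ADHE_STRP2
"""


def split_stanza(stanza):
    """Hypothetical helper: return (slim_stanza_text, edge_rows).

    Assumes the `id:` tag appears before any `relationship:` tag,
    which is the conventional tag order in OBO stanzas.
    """
    subject = None
    kept = []
    edges = []
    for line in stanza.splitlines():
        if line.startswith("id: "):
            subject = line[len("id: "):].strip()
        if line.startswith("relationship: "):
            # Drop the trailing "! label" comment, then read predicate and object.
            value = line[len("relationship: "):].split(" ! ")[0]
            predicate, obj = value.split()[:2]
            edges.append((subject, predicate, obj))
        else:
            # Everything that is not an association stays in the slim stanza.
            kept.append(line)
    return "\n".join(kept) + "\n", edges


def edges_to_tsv(edges):
    """Hypothetical helper: serialize edge rows as a tab-separated table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["subject", "predicate", "object"])
    writer.writerows(edges)
    return buffer.getvalue()


slim, edges = split_stanza(STANZA)
tsv = edges_to_tsv(edges)
print(slim)
print(tsv)
```

With this split, the slim stanza keeps only `id`, `name`, and `is_a`, while the `enables` and `has member` links live in a table that can be filtered (for example to human proteins only) or refreshed on its own schedule without touching the ontology file.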

cmungall · 2024-11-02T01:23:34Z

This is still a major impediment to reusing the fantastic work in obo-db-ingest.

E.g. here is the latest rhea ingest

References biopragmatics/obo-db-ingest#13. A demonstration of the results is in biopragmatics/obo-db-ingest#12. This PR enables serializing to OBO while skipping object properties, as requested by @cmungall.

cmungall mentioned this issue Feb 1, 2024

Improve Rhea import biopragmatics/pyobo#168

Merged

cthoyt added the "help wanted" label Mar 24, 2024

cmungall mentioned this issue Mar 25, 2024

possibilities to extend content for OBODB UniProt biopragmatics/pyobo#172

Closed

cmungall mentioned this issue Apr 24, 2024

Record the fact that a statement is somehow auto-generated? information-artifact-ontology/ontology-metadata#172

Open

cthoyt referenced this issue in biopragmatics/pyobo Nov 4, 2024

Enable outputting slim OBO

6e462af

References #170

cthoyt mentioned this issue Nov 4, 2024

Enable outputting slim OBO biopragmatics/pyobo#202

Merged

cthoyt referenced this issue Nov 4, 2024

Add slim EC

4bd4282

References https://github.com/biopragmatics/pyobo/issues/170 and biopragmatics/pyobo#202

cthoyt mentioned this issue Nov 4, 2024

Demonstrating adding slim version of EC #12

Open

cthoyt transferred this issue from biopragmatics/pyobo Nov 5, 2024
