How is Upheno being used in KG-Phenio? #40

Open
caufieldjh opened this issue May 23, 2022 · 5 comments

@caufieldjh

How does Upheno relate to our target MP and HP phenotypes? How may it help to connect them (and hence give our models a way to learn connections between them)?

We currently have a set of Upheno gold-standard mappings, and they all look roughly like this:

~/kg-phenio/data$ head transformed/upheno_mapping/upheno_mapping_all_edges.tsv
id      subject predicate       object  category        relation
uuid:0749384c-aadd-11ec-b5d8-00155d77d051       HP:0000347      Biolink:same_as MP:0002639      biolink:Association     skos:exactMatch

but these are not included in the merge.
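
A minimal Python sketch of how that gap could be checked, assuming both files are KGX-style TSVs with `subject`, `predicate` and `object` columns. The function names and the merged-file header are assumptions for illustration; the sample rows are the ones quoted in this issue. Predicates are lower-cased before comparison because the mapping file writes `Biolink:same_as` while the merged file writes `biolink:same_as`.

```python
import csv
import io


def load_triples(handle):
    """Read a KGX-style edge TSV and return its (subject, predicate, object) triples."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Lower-case the predicate so "Biolink:same_as" and "biolink:same_as" compare equal.
    return {(row["subject"], row["predicate"].lower(), row["object"]) for row in reader}


def missing_from_merge(mapping_handle, merged_handle):
    """Return mapping triples that have no identical triple in the merged edge file."""
    return load_triples(mapping_handle) - load_triples(merged_handle)


# Rows quoted in this issue; the merged-file header is an assumed column layout.
MAPPING_TSV = (
    "id\tsubject\tpredicate\tobject\tcategory\trelation\n"
    "uuid:0749384c-aadd-11ec-b5d8-00155d77d051\tHP:0000347\tBiolink:same_as"
    "\tMP:0002639\tbiolink:Association\tskos:exactMatch\n"
)
MERGED_TSV = (
    "id\tsubject\tpredicate\tobject\tcategory\trelation\tprovided_by\n"
    "HP:0001596-biolink:same_as-MP:0000414\tHP:0001596\tbiolink:same_as"
    "\tMP:0000414\t\towl:equivalentClass\tGraph\n"
)

missing = missing_from_merge(io.StringIO(MAPPING_TSV), io.StringIO(MERGED_TSV))
print(sorted(missing))
```

On the real data, the two `io.StringIO` objects would be replaced by open file handles for `transformed/upheno_mapping/upheno_mapping_all_edges.tsv` and `merged/merged-kg_edges.tsv`. The comparison is directional, so a mapping stored as MP same_as HP would not match an HP same_as MP edge.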

The input monarch-merged.owl imports and frequently mentions Upheno:

~/kg-phenio/data$ grep UPHENO raw/monarch-merged.owl | wc -l
24254

Many of these are in axioms with Uberon:

$ grep UPHENO -a2 raw/monarch-merged.owl | grep UBERON | wc -l
8338

In the KGX transform, we have just a few UPHENO types:

~/kg-phenio/data$ grep UPHENO merged/merged-kg_nodes.tsv
UPHENO:0001001  biolink:NamedThing      Phenotype                       Graph                                                                                                                                                              1. From OGMS: A (combination of) quality(ies) of an organism determined by the interaction of its genetic make-up and environment that differentiates specific instances of a species from other instances of the same species (from OGMS, and used in OBI, but treatment as a quality is at odds with previous OBI discussions and their treatemnt of 'comparative phenotype assessment, where a phenotype is described as a quality or disposition)  2. From OBI calls: quality or disposition inheres in organism or part of an organism towards some growth environment                                                                                      Stub node that gathers root classes from various taxon-specific phenotype ontologies, as connectors to bringing classes from these ontolgies into the GENO framework.                                                                                                                                          owl:Class
UPHENO:0000001  biolink:NamedThing      has phenotype affecting                 Graph                                                                                                                                                      owl:ObjectProperty
UPHENO:0000504  biolink:NamedThing      absent feather barbule                  Graph                                                                                                                                                      owl:Class
UPHENO:0000502  biolink:NamedThing      curved upper beak                       Graph                                                                                                                                                      owl:Class
UPHENO:0000503  biolink:NamedThing      short upper beak                        Graph                                                                                                                                                      owl:Class
UPHENO:0000501  biolink:NamedThing      accelerated feather growth                      Graph                                                                                                                                              owl:Class
UPHENO:0000506  biolink:NamedThing      increased filoplume feather length                      Graph                                                                                                                                      owl:Class
UPHENO:0000505  biolink:NamedThing      decreased wing length                   Graph                                                                                                                                                      owl:Class

and not many edges:

~/kg-phenio/data$ grep UPHENO merged/merged-kg_edges.tsv
HP:0000118-biolink:subclass_of-UPHENO:0001001   HP:0000118      biolink:subclass_of     UPHENO:0001001          rdfs:subClassOf Graph
MP:0000001-biolink:subclass_of-UPHENO:0001001   MP:0000001      biolink:subclass_of     UPHENO:0001001          rdfs:subClassOf Graph
FBcv:0001347-biolink:subclass_of-UPHENO:0001001 FBcv:0001347    biolink:subclass_of     UPHENO:0001001          rdfs:subClassOf Graph
XPO:00000000-biolink:subclass_of-UPHENO:0001001 XPO:00000000    biolink:subclass_of     UPHENO:0001001          rdfs:subClassOf Graph
ZP:00000000-biolink:subclass_of-UPHENO:0001001  ZP:00000000     biolink:subclass_of     UPHENO:0001001          rdfs:subClassOf Graph
OBO:MGPO_0001001-biolink:subclass_of-UPHENO:0001001     OBO:MGPO_0001001        biolink:subclass_of     UPHENO:0001001          rdfs:subClassOf Graph
UPHENO:0001001-biolink:subclass_of-BFO:0000020  UPHENO:0001001  biolink:subclass_of     BFO:0000020             rdfs:subClassOf Graph
UPHENO:0001001-biolink:subclass_of-PATO:0000001 UPHENO:0001001  biolink:subclass_of     PATO:0000001            rdfs:subClassOf Graph
urn:uuid:45ada42f-9e55-4a6c-8bc6-0a40e365ced9   GENO:0000833    biolink:object  UPHENO:0001001  biolink:Association     OBAN:association_has_object     Graph   owlstar:AllSomeInterpretation
GENO:0000575-biolink:subclass_of-UPHENO:0001001 GENO:0000575    biolink:subclass_of     UPHENO:0001001          rdfs:subClassOf Graph
OBO:PHIPO_0000505-biolink:subclass_of-UPHENO:0001001    OBO:PHIPO_0000505       biolink:subclass_of     UPHENO:0001001          rdfs:subClassOf Graph
WBPhenotype:0000886-biolink:subclass_of-UPHENO:0001001  WBPhenotype:0000886     biolink:subclass_of     UPHENO:0001001          rdfs:subClassOf Graph
OBO:PLANP_00000000-biolink:subclass_of-UPHENO:0001001   OBO:PLANP_00000000      biolink:subclass_of     UPHENO:0001001          rdfs:subClassOf Graph
OBO:DDPHENO_0010000-biolink:subclass_of-UPHENO:0001001  OBO:DDPHENO_0010000     biolink:subclass_of     UPHENO:0001001          rdfs:subClassOf Graph
RO:0002200-rdfs:range-UPHENO:0001001    RO:0002200      rdfs:range      UPHENO:0001001          rdfs:range      Graph
RO:0000052-biolink:subclass_of-UPHENO:0000001   RO:0000052      biolink:subclass_of     UPHENO:0000001          rdfs:subPropertyOf      Graph
RO:0002502-biolink:subclass_of-UPHENO:0000001   RO:0002502      biolink:subclass_of     UPHENO:0000001          rdfs:subPropertyOf      Graph
RO:0002314-biolink:subclass_of-UPHENO:0000001   RO:0002314      biolink:subclass_of     UPHENO:0000001          rdfs:subPropertyOf      Graph

Same_as predicates connecting HP and MP may not use Upheno:

~/kg-phenio/data$ grep MP:0000414 merged/merged-kg_edges.tsv | grep biolink:same_as
HP:0001596-biolink:same_as-MP:0000414   HP:0001596      biolink:same_as MP:0000414              owl:equivalentClass     Graph

Another example:

~/kg-phenio/data$ grep HP:0011599 merged/merged-kg_edges.tsv | grep biolink:same_as
HP:0011599-biolink:same_as-MP:0000650   HP:0011599      biolink:same_as MP:0000650              owl:equivalentClass     Graph

So should we expect to see more paths between HP and MP nodes via Upheno, with KGX simply ignoring them, or is this the best we can get?
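
To put a number on the direct links rather than spot-checking with grep, something like the following sketch could tally `biolink:same_as` edges by ontology prefix pair. The helper names and the header row are assumptions; the three sample rows are copied from the output above.

```python
import csv
import io
from collections import Counter


def prefix(curie):
    """Return the ontology prefix of a CURIE, e.g. 'HP' for 'HP:0001596'."""
    return curie.split(":", 1)[0]


def count_prefix_pairs(handle, predicate="biolink:same_as"):
    """Count edges with the given predicate, grouped by unordered prefix pair."""
    counts = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["predicate"] == predicate:
            pair = tuple(sorted((prefix(row["subject"]), prefix(row["object"]))))
            counts[pair] += 1
    return counts


# Assumed header, followed by rows quoted earlier in this issue.
SAMPLE_EDGES = (
    "id\tsubject\tpredicate\tobject\tcategory\trelation\tprovided_by\n"
    "HP:0001596-biolink:same_as-MP:0000414\tHP:0001596\tbiolink:same_as"
    "\tMP:0000414\t\towl:equivalentClass\tGraph\n"
    "HP:0011599-biolink:same_as-MP:0000650\tHP:0011599\tbiolink:same_as"
    "\tMP:0000650\t\towl:equivalentClass\tGraph\n"
    "MP:0000001-biolink:subclass_of-UPHENO:0001001\tMP:0000001\tbiolink:subclass_of"
    "\tUPHENO:0001001\t\trdfs:subClassOf\tGraph\n"
)

pair_counts = count_prefix_pairs(io.StringIO(SAMPLE_EDGES))
print(pair_counts)
```

Run against the full `merged/merged-kg_edges.tsv`, the `("HP", "MP")` entry would give the total number of direct equivalence edges available to a link-prediction model.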

@justaddcoffee

Thanks Harry for the detective work!

It seems that the reason Luca is observing that the topology of kg-phenio isn't helping with HP-MP link prediction is that we are not merging Upheno data into the merged kg-phenio graph. Maybe Sierra or others on the KGX team can help us understand why?

@caufieldjh

I'm not sure it's all in the original monarch-merged.owl to begin with:

~/kg-phenio/data$ grep UPHENO raw/monarch-merged.owl | sort | uniq
                    <rdf:Description rdf:about="http://purl.obolibrary.org/obo/UPHENO_0001001"/>
                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/UPHENO_0000001"/>
                <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/UPHENO_0001001"/>
            <rdf:Description rdf:about="http://purl.obolibrary.org/obo/UPHENO_0000001"/>
        <rdfs:range rdf:resource="http://purl.obolibrary.org/obo/UPHENO_0001001"/>
        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/UPHENO_0001001"/>
        <rdfs:subPropertyOf rdf:resource="http://purl.obolibrary.org/obo/UPHENO_0000001"/>
    <!-- http://purl.obolibrary.org/obo/UPHENO_0000001 -->
    <!-- http://purl.obolibrary.org/obo/UPHENO_0000501 -->
    <!-- http://purl.obolibrary.org/obo/UPHENO_0000502 -->
    <!-- http://purl.obolibrary.org/obo/UPHENO_0000503 -->
    <!-- http://purl.obolibrary.org/obo/UPHENO_0000504 -->
    <!-- http://purl.obolibrary.org/obo/UPHENO_0000505 -->
    <!-- http://purl.obolibrary.org/obo/UPHENO_0000506 -->
    <!-- http://purl.obolibrary.org/obo/UPHENO_0001001 -->
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/UPHENO_0000501">
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/UPHENO_0000502">
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/UPHENO_0000503">
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/UPHENO_0000504">
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/UPHENO_0000505">
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/UPHENO_0000506">
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/UPHENO_0001001">
    <owl:ObjectProperty rdf:about="http://purl.obolibrary.org/obo/UPHENO_0000001">

That is, we would expect to see more chains of the form A sameAs UPHENO sameAs B, but there just aren't that many distinct UPHENO classes imported to begin with.
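
A rough sketch of how such bridges could be enumerated, assuming the edges have already been loaded as `(subject, predicate, object)` triples. The function name is hypothetical and the predicate is deliberately ignored, so any edge touching an UPHENO node counts as a link. The sample triples are taken from the merged edge file shown earlier.

```python
from collections import defaultdict
from itertools import product


def bridged_pairs(edges, left="HP", right="MP", bridge="UPHENO"):
    """Find (left term, bridge node, right term) paths through a shared bridge node."""
    neighbours = defaultdict(set)
    for subject, _predicate, obj in edges:
        # Record every term adjacent to a bridge node, in either direction.
        if obj.startswith(bridge + ":"):
            neighbours[obj].add(subject)
        if subject.startswith(bridge + ":"):
            neighbours[subject].add(obj)
    paths = set()
    for hub, terms in neighbours.items():
        lefts = [t for t in terms if t.startswith(left + ":")]
        rights = [t for t in terms if t.startswith(right + ":")]
        paths.update((lhs, hub, rhs) for lhs, rhs in product(lefts, rights))
    return paths


# Triples copied from the merged edge output quoted above.
SAMPLE_TRIPLES = [
    ("HP:0000118", "biolink:subclass_of", "UPHENO:0001001"),
    ("MP:0000001", "biolink:subclass_of", "UPHENO:0001001"),
    ("ZP:00000000", "biolink:subclass_of", "UPHENO:0001001"),
    ("UPHENO:0001001", "biolink:subclass_of", "BFO:0000020"),
]

paths = bridged_pairs(SAMPLE_TRIPLES)
print(sorted(paths))
```

With these rows the only HP to MP bridge is the one between the two root classes through `UPHENO:0001001`, which is consistent with the small number of UPHENO classes found in the input.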

@justaddcoffee

@caufieldjh thanks! I just discussed this briefly with Chris, and we think what might be called for here is adding some functionality to KGX to preserve subq phenotype patterns when converting OWL to KGX.

We can hack on this tomorrow on the NEAT call from 9:30 - 10:00 am PT to make a PR for KGX to support this.

@cmungall

Integration test:

  1. there should be some kgx edge with subject HP:0002088 and object UBERON:0002048
  2. in uPheno2 there should be an is-a edge between HP:0002088 and UPHENO:0019970

I think 2 will work just fine. For 1, ideally these edges would be supplied upstream, but in the absence of this I think a procedural transform in kgx or oak is best.
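
The two checks above could be sketched in Python roughly as follows, assuming edges are available as `(subject, predicate, object)` triples. The function names are hypothetical, `biolink:subclass_of` is assumed to be how the is-a edge appears in KGX output, and the fixture rows are made-up placeholders showing what a passing graph would contain, not actual output. The predicate for check 1 is not specified here, so the sketch accepts any predicate for it.

```python
def has_edge(edges, subject, obj, predicate=None):
    """True if some edge links subject to obj, optionally with a specific predicate."""
    return any(
        s == subject and o == obj and (predicate is None or p == predicate)
        for s, p, o in edges
    )


def check_integration(edges):
    """Evaluate both integration-test conditions and report each one separately."""
    return {
        # Check 1: any edge from the HP term to the UBERON term.
        "hp_to_uberon": has_edge(edges, "HP:0002088", "UBERON:0002048"),
        # Check 2: an is-a edge from the HP term to the uPheno class.
        "hp_isa_upheno": has_edge(
            edges, "HP:0002088", "UPHENO:0019970", "biolink:subclass_of"
        ),
    }


# Hypothetical fixture: placeholder rows only, with a placeholder predicate for check 1.
FIXTURE_EDGES = [
    ("HP:0002088", "biolink:related_to", "UBERON:0002048"),
    ("HP:0002088", "biolink:subclass_of", "UPHENO:0019970"),
]

print(check_integration(FIXTURE_EDGES))
```

Reporting the two conditions separately makes it easy to see which one fails when the real merged edges are loaded in place of the fixture.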

@matentzn

I think adding code to KGX has great potential for confusion; it is not at all easy to extract the relationships correctly from subq, unless you do something trivial like creating associative edges with no semantics based on the signature of the EQ ("if limb is mentioned, it's associated"). The better way would be to apply some pressure on this pull request:

mgijax/mammalian-phenotype-ontology#3480

Once we can convince @sbello to add this, we can convince all the other phenotype ontologies, and once it is in all phenotype ontologies apart from HP, we can try to get it into HP. If we can't get it into HP, we can still add a release artefact that is hp.owl + relations, published by hp, to cater for such cases.
