Biolink Model

A high-level datamodel of biological entities (genes, diseases, phenotypes, pathways, individuals, substances, etc.) and their associations.

One of the main uses of the model is as a way of standardizing types and relational structures in knowledge graphs (KGs), where the KG may be either a property graph or RDF.

The schema is expressed as a YAML file, from which the other representations of the model are generated.

Datamodel

The schema assumes a property graph, where nodes represent individual entities, and edges represent associations between nodes. The biolink model provides a schema for both nodes and edges.

Nodes

  • named thing - all nodes are subclasses of 'named thing'

Edges

  • association - all edges are subclasses of 'association'

Slots

  • Slots - 'slot' is the collective term for both node properties and edge properties.
    • node property - all node properties are subclasses of 'node property'
    • association slot - all edge properties are subclasses of 'association slot'

See the Datamodel index for a list of nodes, edges, and slots.
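
To make the node and edge hierarchy concrete, here is a minimal Python sketch of a property graph typed against two roots, 'named thing' and 'association'. The `PARENT` table, the `Node` and `Edge` classes, the field names, and the example identifiers are all hypothetical and chosen for illustration; they are not taken from the Biolink tooling.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Hypothetical fragment of a class hierarchy: child class -> parent class.
# Only the two roots come from the text above; the rest is illustrative.
PARENT: Dict[str, Optional[str]] = {
    "named thing": None,
    "association": None,
    "gene": "named thing",
    "disease": "named thing",
    "gene to disease association": "association",
}


def is_subclass_of(cls: str, ancestor: str) -> bool:
    """Walk up the PARENT table from cls, looking for ancestor."""
    current: Optional[str] = cls
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


@dataclass
class Node:
    id: str
    category: str = "named thing"
    properties: Dict[str, Any] = field(default_factory=dict)  # node properties


@dataclass
class Edge:
    subject: str  # id of the source node (assumed field name)
    object: str   # id of the target node (assumed field name)
    category: str = "association"
    properties: Dict[str, Any] = field(default_factory=dict)  # association slots


def is_valid_node(node: Node) -> bool:
    return is_subclass_of(node.category, "named thing")


def is_valid_edge(edge: Edge) -> bool:
    return is_subclass_of(edge.category, "association")


# Illustrative identifiers only.
gene = Node(id="EX:gene1", category="gene")
disease = Node(id="EX:disease1", category="disease")
link = Edge(subject=gene.id, object=disease.id,
            category="gene to disease association")
```

With this table, `is_valid_node(gene)` and `is_valid_edge(link)` both hold, while an edge category used as a node category is rejected.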

Identifiers

See the Biolink JSON-LD context for the CURIE prefix mappings.

This includes prefix expansions such as:

  "CHEBI": "http://purl.obolibrary.org/obo/CHEBI_",
  "NCBIGene": "http://www.ncbi.nlm.nih.gov/gene/",
  "NCIT": "http://purl.obolibrary.org/obo/NCIT_",

These mappings follow the JSON-LD context standard.
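
As a rough illustration of how such a prefix map is used, the sketch below expands a CURIE to a full IRI and contracts an IRI back to a CURIE. The three prefixes are the ones listed above; the function names `expand_curie` and `contract_iri` and the local identifier used in the usage note are assumptions made for this example, not part of any Biolink API.

```python
from typing import Dict, Optional

# Prefix expansions copied from the examples above.
PREFIXES: Dict[str, str] = {
    "CHEBI": "http://purl.obolibrary.org/obo/CHEBI_",
    "NCBIGene": "http://www.ncbi.nlm.nih.gov/gene/",
    "NCIT": "http://purl.obolibrary.org/obo/NCIT_",
}


def expand_curie(curie: str, prefixes: Dict[str, str] = PREFIXES) -> str:
    """Turn 'PREFIX:local_id' into a full IRI using the prefix map."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"Cannot expand CURIE: {curie!r}")
    return prefixes[prefix] + local_id


def contract_iri(iri: str, prefixes: Dict[str, str] = PREFIXES) -> Optional[str]:
    """Turn a full IRI back into a CURIE, or return None if no prefix matches."""
    for prefix, base in prefixes.items():
        if iri.startswith(base):
            return f"{prefix}:{iri[len(base):]}"
    return None
```

For example, `expand_curie("CHEBI:15377")` yields `http://purl.obolibrary.org/obo/CHEBI_15377`, and `contract_iri` reverses that mapping.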

Note that we do not curate these mappings in Biolink. Rather, we take them from upstream sources via prefixcommons biocontexts. We specify a priority order of upstream sources for cases where conflicts may occur; see the default_curi_maps tag at the top of the biolink-model.yaml file. We also specify a small set of top-level overrides via the prefixes tag at the top of the same file.
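
The conflict-resolution idea can be sketched in a few lines of Python. This is only an illustration of priority-ordered merging with overrides, not the actual Biolink build logic; the function name `merge_prefix_maps`, the `EX` and `OTHER` prefixes, and the example.org URLs are all hypothetical.

```python
from typing import Dict, List


def merge_prefix_maps(
    upstream_maps: List[Dict[str, str]],
    overrides: Dict[str, str],
) -> Dict[str, str]:
    """Merge prefix maps listed from highest to lowest priority.

    When two upstream maps define the same prefix, the earlier
    (higher-priority) map wins. Explicit overrides win over everything.
    """
    merged: Dict[str, str] = {}
    # Apply the lowest-priority map first so higher-priority maps overwrite it.
    for prefix_map in reversed(upstream_maps):
        merged.update(prefix_map)
    # Top-level overrides are applied last.
    merged.update(overrides)
    return merged


# Hypothetical upstream maps that disagree on the 'EX' prefix.
high_priority = {"EX": "http://example.org/high/"}
low_priority = {"EX": "http://example.org/low/", "OTHER": "http://example.org/other/"}

without_override = merge_prefix_maps([high_priority, low_priority], {})
with_override = merge_prefix_maps(
    [high_priority, low_priority], {"EX": "http://example.org/override/"}
)
```

Here `without_override["EX"]` comes from the higher-priority map, `with_override["EX"]` comes from the override, and `OTHER` is kept in both results because nothing conflicts with it.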

BioLink model representation

Neo4J representation

See mapping to neo4j

RDF representation

See mapping to RDF