The Reactome Graph Project models the Reactome knowledgebase into a interconnected graph.
At the cellular level, life is a network of molecular reactions. In Reactome, these processes are systematically described in molecular detail to generate an ordered network of molecular transformations (Fabregat et al. 2015). This amounts to millions of interconnected terms naturally forming a graph of biological knowledge. The Reactome Graph aims to provide an intuitive way for data retrieval as well as interpretation and analysis of pathway knowledge.
Retrieving, and especially analyzing such complex data becomes tedious when using relational Databases. Queries across the pathway knowledge base are composed by a number of expensive join operations resulting in bad performance and a hard-to-maintain project. Due to the schema based approach, relational databases are limited on how information will be stored and thus are difficult to be scaled for new requirements. In order to overcome these problems the Reactome database is imported in Neo4j, creating one big interconnected graph. The Graph database technology is an effective tool for modelling highly connected data. Since molecular networks are by their nature represented as a graph, storing Reactome data is in many ways beneficial: Firstly no normalization is required, data can be stored in its natural form. Secondly, nodes in the vicinity of a starting point can quickly be traversed, giving the user the possibility to not only retrieve data but perform fast analysis of these neighbor networks. Thus, knowledge can be retrieved that previously was not available due to the limitations of the relational data storage.
In the Reactome Graph project Neo4j, a Java implemented, open source, transactional database, with native graph storage and processing, is utilized. Neo4j offers multiple integration possibilities for Java development. In this project, String Data Neo4j provides automatic object graph mapping on top of Neo4j and integrates tightly with other parts of the spring framework used across the project.
- Spring Data Neo4j - version: 4.1.11.RELEASE
- AspectJ
The project can be used as a "Core" library providing a data access layer on top of the Neo4j Db. To get started using the Graph Core, please choose one of the following options:
- Create new Java Standalone Project using Maven Archetype
- Add in an existing Java Standalone Project
- Add in a Spring based project
We provide a Maven archetype as an easy handy way to create and set up a project that access the Reactome Graph Database using the graph-core. Please, go to Graph Archetype page.
mvn archetype:generate \
-DarchetypeGroupId=org.reactome.maven.archetypes \
-DarchetypeArtifactId=graph-archetype \
-DarchetypeVersion=1.0.0 \
-DgroupId=<your_group_id> \
-DartifactId=<your \
-Dversion=1.0.0-SNAPSHOT
-DarchetypeRepository=http://www.ebi.ac.uk/Tools/maven/repos/content/repositories/pst-release/
Dependency
<!-- http://www.ebi.ac.uk/Tools/maven/repos/content/repositories/pst-release/org/reactome/server/graph/graph-core/ for updates -->
<dependency>
<groupId>org.reactome.server.graph</groupId>
<artifactId>graph-core</artifactId>
<version>1.1.18</dependency>
Repository
<!-- EBI repo -->
<repository>
<id>pst-release</id>
<name>EBI Nexus Repository</name>
<url>http://www.ebi.ac.uk/Tools/maven/repos/content/repositories/pst-release</url>
</repository>
Properties
Connection to the Neo4j database can be configured using a properties file called ogm.properties
.
The file can be added in ${project}/src/main/resources
driver=org.neo4j.ogm.drivers.http.driver.HttpDriver
URI=http://user:password@localhost:7474
Following good practices, do no store passwords directly into a properties file. A Maven profile is strongly recommended and let Maven filter the resources into the properties file.
driver=org.neo4j.ogm.drivers.http.driver.HttpDriver
URI=http://${neo4j.user}:${neo4j.password}@${neo4j.host}:${neo4j.port}
To provide properties programmatically, without the file Neo4jConfig.class of the Batch importer has to be overwritten and System properties have to be set:
@org.springframework.context.annotation.Configuration
@ComponentScan( basePackages = {"org.reactome.server.graph"} )
@EnableTransactionManagement
@EnableNeo4jRepositories( basePackages = {"org.reactome.server.graph.repository"} )
@EnableSpringConfigured
public class MyNeo4jConfig extends Neo4jConfig {
private SessionFactory sessionFactory;
@Bean
public Configuration getConfiguration() {
Configuration config = new Configuration();
config.driverConfiguration()
.setDriverClassName("org.neo4j.ogm.drivers.http.driver.HttpDriver")
.setURI("http://".concat(System.getProperty("neo4j.host")).concat(":").concat(System.getProperty("neo4j.port")))
.setCredentials(System.getProperty("neo4j.user"),System.getProperty("neo4j.password"));
return config;
}
@Override
@Bean
public SessionFactory getSessionFactory() {
if (sessionFactory == null) {
logger.info("Creating a Neo4j SessionFactory");
sessionFactory = new SessionFactory(getConfiguration(), "org.reactome.server.graph.domain" );
}
return sessionFactory;
}
}
Afterwards initialise the Reactome Graph Connection and get access the API
public static void main(String[] args) {
// Initialising ReactomeCore Neo4j configuration
ReactomeGraphCore.initialise("localhost", "7474", "user", "password", MyNeo4jConfig.class);
// Instanciate our services
GeneralService genericService = ReactomeGraphCore.getService(GeneralService.class);
DatabaseObjectService databaseObjectService = ReactomeGraphCore.getService(DatabaseObjectService.class);
SpeciesService speciesService = ReactomeGraphCore.getService(SpeciesService.class);
SchemaService schemaService = ReactomeGraphCore.getService(SchemaService.class);
...
}
Dependency
See above
Repository
See above
Enabling graph-core Library
To enable the Graph Core library, when working with Spring, simply add the the following component-scan package to your mvc-dispatcher-servlet.xml
.
<context:component-scan base-package="org.reactome.server" />
Afterwards services can be autowired:
@Autowired
private DatabaseObjectService service;
Lazy loading configuration
Read more about AspectJ and AOP
The Graph core library provides lazy loading functionality. When using a getter of a relationship property, Objects will be loaded automatically if they are present in the Graph database. To deactivate this feature, which might be required in some cases: e.g. serializing results to json, add the following bean to your configuration mvc-dispatcher-servlet.xml
.
<bean class="org.reactome.server.graph.aop.LazyFetchAspect" factory-method="aspectOf">
<property name="enableAOP" value="false"/>
</bean>
The application follows a basic spring multi-tier architecture:
- Presentation Layer - Outside of the scope of this project. The projects acts as a data access layer for the content service. To see details about the content service goto https://github.com/reactome/content-service
- Service Layer - Service and Spring Transactions.
- Persistence Layer - Provided by SpringDataNeo4j repositories and neo4jTemplate
- Database - Neo4j Standalone server