You should have installed:
- Docker
- Docker Compose
- MongoDB Database Tools (specifically mongoimport, to add the dummy data to the database)
- Python 3
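Before going further, it can help to confirm that the prerequisites are actually on your PATH. The following is a minimal sketch, not part of the repository; the executable names (in particular "python3") are assumptions and may differ on your system.

```python
import shutil

# Executables matching the prerequisites list. These names are assumptions:
# on some systems Python 3 is installed as "python" rather than "python3".
REQUIRED_TOOLS = ["docker", "docker-compose", "mongoimport", "python3"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing prerequisites:", ", ".join(missing))
else:
    print("All prerequisites found.")
```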
All of the commands should be executed from the deploy directory.
cd deploy
docker network create my-app-network
docker-compose up -d --build
With mongo-express we can see the contents of the database at http://localhost:8081.
To load the database we execute the following commands:
docker cp /path/to/analyses.json rimongo:tmp/analyses.json
docker cp /path/to/biosamples.json rimongo:tmp/biosamples.json
docker cp /path/to/cohorts.json rimongo:tmp/cohorts.json
docker cp /path/to/datasets.json rimongo:tmp/datasets.json
docker cp /path/to/genomicVariations.json rimongo:tmp/genomicVariations.json
docker cp /path/to/individuals.json rimongo:tmp/individuals.json
docker cp /path/to/runs.json rimongo:tmp/runs.json
# <password> and <host> are placeholders: use the MongoDB root password and host of your deployment.
docker exec rimongo mongoimport --jsonArray --uri "mongodb://root:<password>@<host>:27017/beacon?authSource=admin" --file /tmp/datasets.json --collection datasets
docker exec rimongo mongoimport --jsonArray --uri "mongodb://root:<password>@<host>:27017/beacon?authSource=admin" --file /tmp/analyses.json --collection analyses
docker exec rimongo mongoimport --jsonArray --uri "mongodb://root:<password>@<host>:27017/beacon?authSource=admin" --file /tmp/biosamples.json --collection biosamples
docker exec rimongo mongoimport --jsonArray --uri "mongodb://root:<password>@<host>:27017/beacon?authSource=admin" --file /tmp/cohorts.json --collection cohorts
docker exec rimongo mongoimport --jsonArray --uri "mongodb://root:<password>@<host>:27017/beacon?authSource=admin" --file /tmp/genomicVariations.json --collection genomicVariations
docker exec rimongo mongoimport --jsonArray --uri "mongodb://root:<password>@<host>:27017/beacon?authSource=admin" --file /tmp/individuals.json --collection individuals
docker exec rimongo mongoimport --jsonArray --uri "mongodb://root:<password>@<host>:27017/beacon?authSource=admin" --file /tmp/runs.json --collection runs
This loads the JSON files from the data folder into the MongoDB database container. Each time you import data, you have to create the indexes so that queries run efficiently. See the next step, Create the indexes.
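Since the copy and import commands differ only in the collection name, they can be generated in a loop. This is a minimal sketch that only prints the commands; the function name, the data directory, and the URI value are illustrative assumptions, and the credentials and host in the URI are placeholders you must fill in for your deployment.

```python
import shlex

# Collection names taken from the import commands in this section.
COLLECTIONS = [
    "datasets", "analyses", "biosamples", "cohorts",
    "genomicVariations", "individuals", "runs",
]

# Placeholder URI (assumption): replace <password> and <host> with your own values.
MONGO_URI = "mongodb://root:<password>@<host>:27017/beacon?authSource=admin"


def build_import_commands(data_dir, mongo_uri, container="rimongo"):
    """Return the docker cp commands followed by the mongoimport commands."""
    commands = []
    for name in COLLECTIONS:
        # Copy each JSON file from the host into the container's tmp directory.
        commands.append(
            ["docker", "cp", f"{data_dir}/{name}.json", f"{container}:tmp/{name}.json"]
        )
    for name in COLLECTIONS:
        # Import each copied file into the collection of the same name.
        commands.append([
            "docker", "exec", container, "mongoimport", "--jsonArray",
            "--uri", mongo_uri,
            "--file", f"/tmp/{name}.json",
            "--collection", name,
        ])
    return commands


# Dry run: print the commands instead of executing them.
for argv in build_import_commands("/path/to", MONGO_URI):
    print(shlex.join(argv))
```

To execute the commands rather than print them, each list can be passed to `subprocess.run(argv, check=True)`.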
Remember to repeat this step every time you import new data!
You can create the necessary indexes by running the following Python script:
docker exec beacon python beacon/reindex.py
After deploying all the data, you need to tell the beacon which individual and biosample ids belong to each dataset and cohort. To do that, add the name of each dataset, together with the array of all its ids, to the file datasets.yml. Then do the same for the cohorts in the file cohorts.yml.
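If you have many ids, writing these entries by hand is error-prone. The sketch below renders a name-to-ids mapping as YAML text. The exact layout expected in datasets.yml and cohorts.yml is an assumption here (one key per dataset, with a list of ids under it), so compare the output with the files shipped in the repository before using it. The dataset name and ids are hypothetical.

```python
import json


def render_id_mapping(mapping):
    """Render {name: [ids]} as a YAML mapping of block sequences.

    json.dumps is used for each scalar because a JSON string is also a
    valid double-quoted YAML scalar, which avoids quoting problems.
    """
    lines = []
    for name, ids in mapping.items():
        lines.append(f"{json.dumps(name)}:")
        lines.extend(f"  - {json.dumps(identifier)}" for identifier in ids)
    return "\n".join(lines) + "\n"


# Hypothetical dataset name and ids, for illustration only.
example = {"my_dataset": ["individual_1", "individual_2", "biosample_1"]}
print(render_id_mapping(example), end="")
```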
This step analyzes all the collections of the Mongo database, first extracting the ontology OBO files and then filling the filtering terms endpoint with information from the data loaded in the database.
You can automatically fetch the ontologies and extract the filtering terms by running the following script:
docker exec beacon python beacon/db/extract_filtering_terms.py
If you have the ontologies loaded and the filtering terms extracted, you can automatically get their descendant and semantic similarity terms by running the following script:
docker exec beacon python beacon/db/get_descendants.py
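The three scripts (indexes, filtering terms, descendants) have to be rerun in this order after each import, so it may be convenient to chain them. The following is a minimal sketch under that assumption; the helper names are hypothetical, and the script only prints the commands unless you call `run_post_import()` yourself.

```python
import shlex
import subprocess

# Scripts from this section, in the order they are described.
POST_IMPORT_SCRIPTS = [
    "beacon/reindex.py",
    "beacon/db/extract_filtering_terms.py",
    "beacon/db/get_descendants.py",
]


def post_import_commands(container="beacon"):
    """Return one `docker exec` command per post-import script."""
    return [
        ["docker", "exec", container, "python", script]
        for script in POST_IMPORT_SCRIPTS
    ]


def run_post_import(run=subprocess.run, container="beacon"):
    """Run every step in order; check=True aborts at the first failure."""
    for argv in post_import_commands(container):
        run(argv, check=True)


# Dry run: show what would be executed.
for argv in post_import_commands():
    print(shlex.join(argv))
```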
Check the logs until the beacon is ready to be queried:
docker-compose logs -f beacon
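Instead of watching the logs, you can poll the API until it answers. This is a minimal sketch using only the standard library; using the g_variants URL from the examples as the probe target is an assumption, and any endpoint that responds once the service is up would work equally well.

```python
import time
import urllib.error
import urllib.request


def beacon_responds(url, timeout=5):
    """Return True if `url` answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(url, probe=beacon_responds, attempts=30, delay=2.0,
                     sleep=time.sleep):
    """Probe `url` up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For example, `wait_until_ready("http://localhost:5050/api/g_variants")` returns True as soon as the beacon answers, or False after all attempts have failed.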
You can query the beacon using GET or POST. Below, you can find some examples of usage.
For simplicity (and readability), we will be using HTTPie.
Querying this endpoint should return the 13 variants of the beacon (paginated):
http GET http://localhost:5050/api/g_variants
You can also add request parameters to the query, like so:
http GET 'http://localhost:5050/api/individuals?filters=NCIT:C16576,NCIT:C42331'
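If you prefer to build such URLs from a script, the query string can be assembled with the standard library. This is a minimal sketch; `build_get_url` is a hypothetical helper, and the base URL is the one used in the examples of this section.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:5050/api"


def build_get_url(endpoint, filters=None, **params):
    """Build a GET URL; `filters` is joined into one comma-separated value."""
    query = dict(params)
    if filters:
        query["filters"] = ",".join(filters)
    if not query:
        return f"{BASE_URL}/{endpoint}"
    # Keep ':' and ',' unescaped so ontology ids stay readable.
    return f"{BASE_URL}/{endpoint}?{urlencode(query, safe=':,')}"


print(build_get_url("individuals", filters=["NCIT:C16576", "NCIT:C42331"]))
# prints http://localhost:5050/api/individuals?filters=NCIT:C16576,NCIT:C42331
```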
You can also send queries with POST. With a request.json file like this one:
{
  "meta": {
    "apiVersion": "2.0"
  },
  "query": {
    "requestParameters": {
      "alternateBases": "G",
      "referenceBases": "A",
      "start": [16050074],
      "end": [16050568],
      "variantType": "SNP"
    },
    "filters": [],
    "includeResultsetResponses": "HIT",
    "pagination": {
      "skip": 0,
      "limit": 10
    },
    "testMode": false,
    "requestedGranularity": "record"
  }
}
You can execute:
curl \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "meta": {
      "apiVersion": "2.0"
    },
    "query": {
      "requestParameters": {
        "alternateBases": "G",
        "referenceBases": "A",
        "start": [16050074],
        "end": [16050568],
        "variantType": "SNP"
      },
      "filters": [],
      "includeResultsetResponses": "HIT",
      "pagination": {
        "skip": 0,
        "limit": 10
      },
      "testMode": false,
      "requestedGranularity": "record"
    }
  }' \
  http://localhost:5050/api/g_variants
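The same POST request can be sent from Python without extra dependencies. The sketch below assumes the request body is stored in request.json as shown above; the helper names are hypothetical, and nothing is sent until you call `post_json_file` yourself.

```python
import json
import urllib.request


def build_post_request(url, body):
    """Wrap `body` (a dict) in a JSON POST request for `url`."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json_file(url, path, opener=urllib.request.urlopen):
    """Send the JSON document stored at `path` and return the parsed reply."""
    with open(path, encoding="utf-8") as handle:
        body = json.load(handle)
    with opener(build_post_request(url, body)) as response:
        return json.load(response)
```

For example, `post_json_file("http://localhost:5050/api/g_variants", "request.json")` returns the beacon response as a Python dictionary.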
But you can also use complex filters:
{
  "meta": {
    "apiVersion": "2.0"
  },
  "query": {
    "filters": [
      {
        "id": "UBERON:0000178",
        "scope": "biosample",
        "includeDescendantTerms": false
      }
    ],
    "includeResultsetResponses": "HIT",
    "pagination": {
      "skip": 0,
      "limit": 10
    },
    "testMode": false,
    "requestedGranularity": "count"
  }
}
You can execute:
http POST http://localhost:5050/api/biosamples --json < request.json
The beacon will then use the ontology filter to filter the results.
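When you need several variations of a filter query, generating the request body can be easier than editing request.json by hand. This is a minimal sketch that mirrors the fields of the example above; `ontology_filter` and `build_query` are hypothetical helpers, not part of the beacon code.

```python
import json


def ontology_filter(term_id, scope, include_descendants=False):
    """Return one filter object in the shape used by the example above."""
    return {
        "id": term_id,
        "scope": scope,
        "includeDescendantTerms": include_descendants,
    }


def build_query(filters, granularity="count", skip=0, limit=10):
    """Return a full request body with the given filters and pagination."""
    return {
        "meta": {"apiVersion": "2.0"},
        "query": {
            "filters": list(filters),
            "includeResultsetResponses": "HIT",
            "pagination": {"skip": skip, "limit": limit},
            "testMode": False,
            "requestedGranularity": granularity,
        },
    }


body = build_query([ontology_filter("UBERON:0000178", "biosample")])
print(json.dumps(body, indent=2))
```

Redirecting the printed output to request.json gives a file that can be used with the HTTPie command shown above.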