A script to create a gene-focussed BridgeDb database based on Ensembl BioMart.
Java 11 is required.
Compile the code with:
mvn clean install
cp target/org.bridgedb.genedb-jar-with-dependencies.jar BioMart2BridgeDb.jar
Run the tool in your terminal:
java -jar BioMart2BridgeDb.jar <configFile> <outputPath> <oldDB> <inclusive>
- <configFile>: location of the configuration file
- <outputPath>: path for the new database
- <oldDB>: (optional) directory of the old database, used to run QC
- <inclusive>: (optional) use the inclusive BridgeDb list
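To illustrate the argument order, here is a minimal dry-run sketch that assembles the command line and prints it without starting Java. The helper name `build_command` and all paths are hypothetical placeholders. `<inclusive>` is left out because its accepted values are not documented in this README.

```shell
#!/bin/sh
# Dry-run sketch: build the BioMart2BridgeDb command line and print it.
# build_command is a hypothetical helper; the paths used below are placeholders,
# not files shipped with the project.

build_command() {
    config_file=$1      # <configFile>: required
    output_path=$2      # <outputPath>: required
    old_db=${3:-}       # <oldDB>: optional

    if [ -z "$config_file" ] || [ -z "$output_path" ]; then
        echo "usage: build_command <configFile> <outputPath> [oldDB]" >&2
        return 1
    fi

    cmd="java -jar BioMart2BridgeDb.jar $config_file $output_path"
    if [ -n "$old_db" ]; then
        cmd="$cmd $old_db"
    fi
    echo "$cmd"
}

# Required arguments only.
build_command "config/ArabidopsisThaliana.config" "output/"
# With the optional old database directory.
build_command "config/ArabidopsisThaliana.config" "output/" "previous/"
```

The printed line can be copied into the terminal once the placeholder paths are replaced. Paths that contain spaces would need to be quoted by hand.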
Configuration files can be found in https://github.com/bridgedb/create-bridgedb-genedb-config/tree/master/configFiles.
Example: Arabidopsis thaliana config file
- Give the version of Ensembl BioMart to query:
e.g: http://www.ensembl.org/biomart/, http://oct2014.archive.ensembl.org/biomart/, http://nov2020-metazoa.ensembl.org/biomart/
endpoint=https://nov2020-plants.ensembl.org/biomart/
You can find an overview of releases in the Ensembl Archive, Metazoa Archive, Plants Archive, Fungi Archive.
- MartRegistry for plants v49 can be found here:
https://nov2020-plants.ensembl.org/biomart/martservice?type=registry
e.g: plants_mart, metazoa_mart, default
schema=plants_mart
- Code name of the species: http://www.ensembl.org/biomart/martservice?type=datasets&mart=ENSEMBL_MART_ENSEMBL, Metazoa v49, Plants v49, and Fungi v49
species=athaliana_eg_gene
- The name of the bridge database
database_name=Arabidopsis thaliana genes and proteins
- The name of the .bridge file to create
file_name=At_Derby_Ensembl_Plant_49
- The data source code names for Arabidopsis thaliana can be found here:
probe_datasource=Affy,Agilent
probe_set=affy_aragene,affy_ath1_121501,agilent_g2519f_015059,agilent_g2519f_021169,agilent_g4136a_011839,agilent_g4136b_013324,agilent_g4142a_012600
gene_datasource=entrezgene_id,go_id,mirbase_accession,mirbase_id,pdb,refseq_dna,refseq_peptide,uniprotsptrembl,uniprotswissprot,tair_locus,nasc_gene_id
- Optional filters (chromosome list) for Arabidopsis thaliana can be found here: https://nov2020-plants.ensembl.org/biomart/martservice?type=filters&dataset=athaliana_eg_gene
e.g: chromosome_name=1,2,3,4,5,Pt,Mt
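The registry and filter URLs in this example share a pattern based on the `endpoint` and `species` values. The following sketch reads those two values from a config file and prints the matching lookup URLs. It assumes the config consists of plain `key=value` lines. The helper names `get_value` and `lookup_urls` are hypothetical and not part of BioMart2BridgeDb.

```shell
#!/bin/sh
# Sketch: read values from a key=value config file and print the BioMart
# lookup URLs (registry and filters) that correspond to it.
# Assumption: the config is plain "key=value" lines. get_value and
# lookup_urls are hypothetical helper names.

# Print the value of the first line in <file> that starts with "<key>=".
get_value() {
    key=$1
    file=$2
    sed -n "s/^${key}=//p" "$file" | head -n 1
}

lookup_urls() {
    file=$1
    endpoint=$(get_value endpoint "$file")
    species=$(get_value species "$file")

    if [ -z "$endpoint" ] || [ -z "$species" ]; then
        echo "config must define endpoint and species" >&2
        return 1
    fi

    # Drop a trailing slash so each URL is joined with exactly one "/".
    endpoint=${endpoint%/}

    echo "${endpoint}/martservice?type=registry"
    echo "${endpoint}/martservice?type=filters&dataset=${species}"
}

# Example: a temporary config holding three values from the Arabidopsis example.
config=$(mktemp)
cat > "$config" <<'EOF'
endpoint=https://nov2020-plants.ensembl.org/biomart/
schema=plants_mart
species=athaliana_eg_gene
EOF
lookup_urls "$config"
rm -f "$config"
```

Opening the printed URLs in a browser is one way to check a schema name or a chromosome filter before running the full build.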