Commit

Merge branch 'development'
# Conflicts:
#	README.md
#	databases/geneset_generation.R
#	setup.sh
metzgerpatrick committed Jul 10, 2020
2 parents a80449b + 5df0350 commit 183fe76
Showing 3 changed files with 18 additions and 7 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -17,7 +17,7 @@ Prior running the setup script, some components need to be installed manually:

- databases
- [hallmark gene-sets](http://software.broadinstitute.org/gsea/msigdb/)
- - h.all.v7.1.entrez.gmt, (any other version should work as well if the filename is adjusted in [geneset_generation.R](https://github.com/AG-Boerries/MIRACUM-Pipe-docker/blob/master/databases/geneset_generation.R) in line 2
+ - h.all.vX.X.entrez.gmt (current release is v7.1 (June 2020))
- [condel score](https://bbglab.irbbarcelona.org/fannsdb/)
- fannsdb.tsv.gz
- fannsdb.tsv.gz.tbi
@@ -34,7 +34,7 @@ To install the databases follow the links, register and download the listed file
Next, run the setup script. We recommend to install everything, which does **not** include the example and reference data. There are also options to install and setup parts:

```bash
- ./setup.sh -t all
+ ./setup.sh -t all -m h.all.v7.1.entrez.gmt
```

See `setup.sh -h` to list the available options. By default, we do not install the reference genome as well as our example. If you want to install it run
@@ -51,7 +51,7 @@ See `setup.sh -h` to list the available options. By default, we do not install t
- create a database for the latest COSMIC release (according to the [annovar manual](http://annovar.openbioinformatics.org/en/latest/user-guide/filter/#cosmic-annotations))
- Download [prepare_annovar_user.pl](http://www.openbioinformatics.org/annovar/download/prepare_annovar_user.pl) and add to annovar folder
- register at [COSMIC](https://cancer.sanger.ac.uk/cosmic);
- - Download the latest release for GRCh37 (as of October 2019 the latest release is v90):
+ - Download the latest release for GRCh37 (as of June 2020 the latest release is v91):
- VCF/CosmicCodingMuts.vcf.gz
- VCF/CosmicNonCodingVariants.vcf.gz
- CosmicMutantExport.tsv.gz
5 changes: 4 additions & 1 deletion databases/geneset_generation.R
@@ -1,5 +1,8 @@
library(GSA)
- gmt <- GSA.read.gmt('h.all.v7.1.entrez.gmt')
+
+ args <- commandArgs(trailingOnly = TRUE)
+ hallmarks <- args[1]
+ gmt <- GSA.read.gmt(hallmarks)
genesets <- gmt$genesets
names <- data.frame(Names = gmt$geneset.names, Descriptions = gmt$geneset.descriptions)
names(genesets) <- names$Names
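
Because the R script now reads whatever file is passed as its first argument, a malformed download only surfaces inside `GSA.read.gmt`. A minimal pre-flight sketch, assuming the usual GMT layout (one gene set per line: name, description, then genes, all tab-separated), could look like the following. `check_gmt` is a hypothetical helper and not part of this commit.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the commit): verify that every
# line of a GMT file has a name, a description and at least one gene,
# separated by tabs. Returns non-zero if any line is too short.
check_gmt() {
  local file=$1
  awk -F'\t' '
    NF < 3 {
      bad = 1
      printf "line %d: expected >= 3 tab-separated fields, got %d\n", NR, NF
    }
    END { exit bad }
  ' "${file}"
}
```

It could be called as `check_gmt h.all.v7.1.entrez.gmt || exit 1` before handing the file to `Rscript`.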
14 changes: 11 additions & 3 deletions setup.sh
@@ -13,13 +13,15 @@ function usage() {
echo "usage: setup -t task"
echo " -t task specify task: $(join_by ' ' ${VALID_TASKS})"
echo " -h show this help screen"
+ echo " -m file MSigDB hallmarks gene-set h.all.vX.X.entrez.gmt file"
exit 1
}

- while getopts d:t:ph option; do
+ while getopts d:t:m:ph option; do
case "${option}" in
d) readonly PARAM_DIR_PATIENT=$OPTARG ;;
t) PARAM_TASK=$OPTARG ;;
+ m) readonly HALLMARKS=$OPTARG ;;
h) usage ;;
\?)
echo "Unknown option: -$OPTARG" >&2
@@ -214,6 +216,12 @@ function install_databases() {
function setup_databases() {
echo "setup databases"

+ if [[ -z "${HALLMARKS}" ]]; then
+ echo "no hallmarks file provided! Please provide the h.all.vX.X.entrez.gmt file from MSigDB."
+ exit 1
+ fi
+ echo "${HALLMARKS}"
+
cd "${DIR_DATABASES}" || exit 1

BIN_RSCRIPT=$(which Rscript)
@@ -223,9 +231,9 @@ function setup_databases() {
fi

## R Code for processing
- ${BIN_RSCRIPT} --vanilla geneset_generation.R
+ ${BIN_RSCRIPT} --vanilla geneset_generation.R "${HALLMARKS}"

- rm -f h.all.v7.1.entrez.gmt
+ #rm -f "${HALLMARKS}"

echo "done"
}
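
The new `-m` handling can be tried in isolation. The sketch below mirrors the patched `getopts` loop and the empty-value guard from `setup_databases`, and adds a readability check that the commit itself does not perform. `parse_hallmarks` is a hypothetical name used only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of setup.sh): parse -t/-m like the patched
# getopts loop and print the chosen hallmarks file on success.
parse_hallmarks() {
  local OPTIND=1 option task="" hallmarks=""
  while getopts "t:m:" option; do
    case "${option}" in
      t) task=$OPTARG ;;        # parsed for parity with setup.sh, unused here
      m) hallmarks=$OPTARG ;;
      *) return 2 ;;            # unknown option or missing argument
    esac
  done

  # Same guard as in setup_databases: refuse to continue without -m.
  if [[ -z "${hallmarks}" ]]; then
    echo "no hallmarks file provided" >&2
    return 1
  fi

  # Extra check (an assumption, not in the commit): the file must be readable.
  if [[ ! -r "${hallmarks}" ]]; then
    echo "hallmarks file not readable: ${hallmarks}" >&2
    return 1
  fi

  echo "${hallmarks}"
}
```

Since `setup_databases` changes into `${DIR_DATABASES}` before calling `Rscript`, a relative path given to `-m` is presumably resolved against that directory, so a check like this would need to run after the `cd` or use an absolute path.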