diff --git a/pcgrr/vignettes/installation.Rmd b/pcgrr/vignettes/installation.Rmd
index faa1a58b..632a7d09 100644
--- a/pcgrr/vignettes/installation.Rmd
+++ b/pcgrr/vignettes/installation.Rmd
@@ -1,13 +1,18 @@
---
title: "Installation"
output: rmarkdown::html_document
+
---
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(comment = "", collapse = TRUE)
+```
+
+
```{r load_pkgs, include=FALSE, echo=FALSE, message=FALSE, warning=FALSE}
require(glue, include.only = "glue")
```
-
```{r vars, echo=FALSE}
Sys.setenv(VEP_VERSION = "112")
Sys.setenv(PCGR_VERSION = "1.4.1.9014")
@@ -18,50 +23,48 @@ BUNDLE_VERSION <- Sys.getenv("BUNDLE_VERSION")
```
```{r funcs, echo=FALSE}
-bundle_link <- function(v, hg) {
- glue("[{hg} - {v}](https://insilico.hpc.uio.no/pcgr/pcgr_ref_data.{v}.{hg}.tgz)")
+bundle_link <- function(hg) {
+ v <- BUNDLE_VERSION
+ glue("https://insilico.hpc.uio.no/pcgr/pcgr_ref_data.{v}.{hg}.tgz")
}
```
-The PCGR workflow has several data requirements and software installation options.
-
-- Data requirements:
- - Sample-specific inputs (e.g. somatic variant calls in VCF format)
- - Reference bundle (e.g. CIViC, CGI, TCGA)
- - Ensembl VEP data cache
+## Data
-- Software options:
- - Conda
- - Docker
- - Singularity/Apptainer
+PCGR requires the following data:
-## Data
+- Sample-specific inputs (e.g. somatic variant calls in VCF format)
+- Reference bundle (e.g. CIViC, CGI, TCGA)
+- Ensembl VEP data cache
-PCGR supports GRCh37 and GRCh38 sample-specific inputs. The reference bundle and
-VEP data cache need to match the chosen human genome assembly.
+PCGR supports the GRCh37 and GRCh38 human genome assemblies. All the data above
+need to match the chosen assembly.
### 1. Reference Bundle
-Reference bundles are generated semi-automatically by the author and versioned
-based on their release date. Keep in mind that the bundles support only certain
-Ensembl VEP versions. The genome-specific bundle is available from below (size: ~5G):
-
-- `r bundle_link(v = BUNDLE_VERSION, hg = "grch37")`
-- `r bundle_link(v = BUNDLE_VERSION, hg = "grch38")`
+Reference bundles are generated semi-automatically (by the PCGR author) and
+are versioned based on their release date. Keep in mind that the bundles support
+only certain Ensembl VEP versions. The genome-specific bundles
+(**v`r BUNDLE_VERSION`**) can be downloaded from the links below (size: ~5 GB):
-**Tip**: The `data/grch3x/.PCGR_BUNDLE_VERSION` file indicates the bundle version.
+| Assembly | Download Link |
+|----------|---------------------------|
+| GRCh38 | `r bundle_link("grch38")` |
+| GRCh37 | `r bundle_link("grch37")` |
-
-Bash example
+**Tip**: The `data/grch3x/.PCGR_BUNDLE_VERSION` file within the downloaded bundle
+indicates the bundle version for reporting purposes.
+#### Bash Example
+```{bash echo=FALSE}
+echo "BUNDLE_VERSION=\"${BUNDLE_VERSION}\""
+```
-```{bash eval=FALSE}
+```bash
GENOME="grch38" # or "grch37"
-BUNDLE_VERSION="20240612"
BUNDLE="pcgr_ref_data.${BUNDLE_VERSION}.${GENOME}.tgz"
-
wget https://insilico.hpc.uio.no/pcgr/${BUNDLE}
gzip -dc ${BUNDLE} | tar xvf -
@@ -69,13 +72,11 @@ mkdir ${BUNDLE_VERSION}
mv data/ ${BUNDLE_VERSION}
```
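After extraction, the version file mentioned in the tip above can be used as a quick sanity check. The following is a minimal sketch and not part of PCGR: `check_bundle_version` is a hypothetical helper, and it assumes that `.PCGR_BUNDLE_VERSION` is a plain-text file containing the version string.

```bash
# Hypothetical helper (not part of PCGR): check that an extracted bundle
# reports the expected version.
# Assumption: .PCGR_BUNDLE_VERSION is plain text and contains the version string.
check_bundle_version() {
  local refdata_dir="$1"   # directory that contains data/
  local genome="$2"        # "grch37" or "grch38"
  local expected="$3"      # version used for the download
  local version_file="${refdata_dir}/data/${genome}/.PCGR_BUNDLE_VERSION"

  if [ ! -f "${version_file}" ]; then
    echo "missing: ${version_file}" >&2
    return 1
  fi
  if grep -qF -- "${expected}" "${version_file}"; then
    echo "bundle version OK: ${expected}"
  else
    echo "bundle version mismatch in ${version_file}" >&2
    return 1
  fi
}

# Example, reusing the variables from the download step:
# check_bundle_version "${BUNDLE_VERSION}" "${GENOME}" "${BUNDLE_VERSION}"
```

The first argument is the directory that `data/` was moved into, which in the example above is named after the bundle version.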
-
-
### 2. VEP Cache
-Ensembl [VEP][vep-web] requires a data cache which is available from the Ensembl
+[VEP][vep-web] requires a data cache which is available from the Ensembl
[FTP site][ensembl-ftp] (search there for files starting with `homo_sapiens_vep_`).
-We currently support Ensembl VEP version `112`.
+We currently support Ensembl VEP **v`r VEP_VERSION`**.
**Tip**: PCGR needs to be pointed to the _parent_ directory containing
the downloaded `homo_sapiens/xyz_GRCh3x/` cache, which is usually called `.vep` if
@@ -85,32 +86,80 @@ you've followed the VEP cache [download instructions][vep-cache].
[ensembl-ftp]: https://ftp.ensembl.org/pub/release-112/variation/indexed_vep_cache/
[vep-cache]: https://asia.ensembl.org/info/docs/tools/vep/script/vep_cache.html#cache
-- Bash example:
+#### Bash Example
+
+```{bash echo=FALSE}
+echo "VEP_VERSION=\"${VEP_VERSION}\""
+```
```bash
GENOME="GRCh38" # or "GRCh37"
-VEP_VERSION="112"
CACHE="homo_sapiens_vep_${VEP_VERSION}_${GENOME}.tar.gz"
wget https://ftp.ensembl.org/pub/release-${VEP_VERSION}/variation/indexed_vep_cache/${CACHE}
gzip -dc ${CACHE} | tar xvf -
```
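Because PCGR expects the parent of the `homo_sapiens/` folder, it can help to verify the layout before a run. This is only a sketch: `check_vep_dir` is a hypothetical helper, and it assumes the cache unpacks to `homo_sapiens/<VEP_VERSION>_<GENOME>/` (the `xyz_GRCh3x` pattern from the tip above).

```bash
# Hypothetical helper (not part of PCGR or VEP): confirm that the directory
# handed to PCGR is the parent of the homo_sapiens/ cache folder.
# Assumption: the cache unpacks to homo_sapiens/<VEP_VERSION>_<GENOME>/.
check_vep_dir() {
  local vep_dir="$1"       # parent directory, e.g. the .vep folder
  local vep_version="$2"   # Ensembl VEP release number
  local genome="$3"        # "GRCh37" or "GRCh38"
  local cache_dir="${vep_dir}/homo_sapiens/${vep_version}_${genome}"

  if [ -d "${cache_dir}" ]; then
    echo "VEP cache found: ${cache_dir}"
  else
    echo "VEP cache not found: ${cache_dir}" >&2
    return 1
  fi
}

# Example, run from the directory where the cache was extracted:
# check_vep_dir "." "${VEP_VERSION}" "${GENOME}"
```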
+-----------------------------
+
## Software
-The PCGR workflow can be installed using [Conda][conda-web], [Docker][docker-web],
-or [Singularity/Apptainer][apptainer-web].
+The PCGR workflow can be installed using [Docker][docker-web],
+[Singularity/Apptainer][apptainer-web] or [Conda][conda-web].
[conda-web]: https://conda.io/projects/conda/en/latest/user-guide/getting-started.html
[docker-web]: https://docs.docker.com/
[apptainer-web]: https://apptainer.org/docs/user/latest/index.html
-### Conda
+### A. Docker
+
+The Docker image is available on [DockerHub](https://hub.docker.com/r/sigven/pcgr/tags).
+Pull the latest **v`r PCGR_VERSION`** image with:
+
+```{r echo=FALSE}
+glue("docker pull sigven/pcgr:{PCGR_VERSION}\n",
+     "# might need to specify platform\n",
+     "# docker pull --platform=amd64 sigven/pcgr:{PCGR_VERSION}")
+```
+
+#### Example Run
+
+```bash
+docker container run -it --rm \
+ -v /Users/you/projects/.vep:/mnt/.vep \
+ -v /Users/you/projects/bundle:/mnt/bundle \
+ -v /Users/you/projects/pcgr_inputs:/mnt/pcgr_inputs \
+ -v /Users/you/projects/pcgr_outputs:/mnt/pcgr_outputs \
+ sigven/pcgr:1.4.1.9014 \
+ pcgr \
+ --input_vcf "/mnt/pcgr_inputs/tumor_sample.BRCA.vcf.gz" \
+ --vep_dir "/mnt/.vep" \
+ --refdata_dir "/mnt/bundle" \
+ --output_dir "/mnt/pcgr_outputs" \
+ --genome_assembly "grch38" \
+ --sample_id "SampleB" \
+ --assay "WGS" \
+ --vcf2maf
+```
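A mistyped host path in a `-v` option is easy to miss, since Docker generally creates a missing host directory as an empty one rather than reporting an error. The sketch below checks the host side of each mount before the container is started; `require_dirs` is a hypothetical helper and not part of PCGR.

```bash
# Hypothetical pre-flight check (not part of PCGR): make sure every host
# directory exists before it is bind-mounted with -v.
require_dirs() {
  local missing=0
  local d
  for d in "$@"; do
    if [ ! -d "${d}" ]; then
      echo "missing host directory: ${d}" >&2
      missing=1
    fi
  done
  # Non-zero exit status if at least one directory is missing
  return "${missing}"
}

# Example, using the host paths from the run above:
# require_dirs /Users/you/projects/.vep /Users/you/projects/bundle \
#   /Users/you/projects/pcgr_inputs /Users/you/projects/pcgr_outputs
```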
+
+### B. Singularity/Apptainer
+
+```{r echo=FALSE}
+glue("apptainer pull oras://ghcr.io/sigven/pcgr:{PCGR_VERSION}.singularity")
+```
+
+
-There is conda support for both Linux and macOS machines:
+### C. Conda
-
-Linux
+Conda is supported on both Linux and macOS machines.
+Installing from scratch can take anywhere from 10 to 40 minutes,
+depending mostly on the speed of your internet connection and of the package servers.
+Most of the time is spent downloading the `{BSgenome.Hsapiens.UCSC.hg19}` and
+`{BSgenome.Hsapiens.UCSC.hg38}` R packages (which happens at the very end of the
+Conda environment creation).
+
+#### Linux
```bash
# set up variables
@@ -127,12 +176,9 @@ conda activate ./pcgr_conda/pcgr
pcgr --version
```
-
-
-
-macOS
+#### macOS
-For macOS M1 machines, you need to have `CONDA_SUBDIR=osx-64` before the
+For macOS M1 machines, you need to set `CONDA_SUBDIR=osx-64` before the
`conda create` command - see
:
@@ -150,180 +196,3 @@ conda activate ./pcgr_conda/pcgr
# test that it works
pcgr --version
```
-
-
-
-### Docker
-
-See the [Docker setup](#dockersetup) section for more details.
-
-```bash
-PCGR_VERSION="1.4.1.9014"
-docker pull sigven/pcgr:${PCGR_VERSION}
-# might need to specify platform
-# docker pull --platform=amd64 sigven/pcgr:${PCGR_VERSION}
-```
-
-### Singularity/Apptainer
-
-```bash
-PCGR_VERSION="1.4.1.9014"
-apptainer pull oras://ghcr.io/sigven/pcgr:${PCGR_VERSION}.singularity
-```
-
-
-
-
-
-
-
-
-### STEP 2: Set up Conda or Docker
-
-Step 2 depends on if you want to use Conda or Docker:
-
-- For Conda, continue reading the [PCGR Conda setup](#condasetup).
-- For Docker, skip to the [PCGR Docker setup](#dockersetup).
-
-
-
-### Option 1: Conda
-
-#### a) Miniconda and conda
-
-Download and install the Miniconda installer from :
-
-- Make sure to download the Linux or MacOSX script according to which platform you're currently on.
-- Run `bash miniconda.sh` and follow the prompts (it should be okay to accept the defaults, unless you want to choose a different
- installation location than the default `~/miniconda3`).
-- Exit your current terminal session and open a new one. You should now notice something like a `(base)` string as a
- prefix in your terminal prompt. This means that you're in the `base` conda environment, and you're ready to start
- installing the conda environments for PCGR.
-
-```bash
-PLATFORM="MacOSX" # or "Linux"
-MINICONDA_URL="https://repo.continuum.io/miniconda/Miniconda3-latest-${PLATFORM}-x86_64.sh"
-wget ${MINICONDA_URL} -O miniconda.sh && chmod +x miniconda.sh
-bash miniconda.sh
-```
-
-```text
-# exit terminal and open new one - you should now see:
-
-# as of May 2024
-(base) $ conda --version
-conda 24.5.0
-```
-
-#### b) Create PCGR conda environments
-
-The `conda/env/lock` directory in the PCGR codebase contains two `.lock` files which
-can be used to create the required conda environments for the Python component
-(`pcgr`) and the R components (`pcgrr` (and `cpsr`)). We install the conda
-dependencies for these two environments in a local `conda` directory in the
-following example:
-
-```bash
-cd /Users/you/dir4/conda
-PLATFORM="osx-64" # or "linux-64"
-
-PCGR_VERSION="1.4.1.9014"
-PCGR_REPO="https://raw.githubusercontent.com/sigven/pcgr/v${PCGR_VERSION}/conda/env/lock/"
-PLATFORM="linux" # or "osx"
-
-conda create --prefix ./pcgr --file ${PCGR_REPO}/pcgr-${PLATFORM}-64.lock
-conda create --prefix ./pcgrr --file ${PCGR_REPO}/pcgrr-${PLATFORM}-64.lock
-
-## Alternatively, for installing in your central conda directory, use the following:
-# conda create --name pcgr --file ${PCGR_CONDA_ENV_DIR}/lock/pcgr-${PLATFORM}.lock
-# conda create --name pcgrr --file ${PCGR_CONDA_ENV_DIR}/lock/pcgrr-${PLATFORM}.lock
-
-## For MacOS M1, you need to have 'CONDA_SUBDIR=osx-64' before the conda command, i.e.:
-# CONDA_SUBDIR=osx-64 conda create --prefix [...] --file [...]
-```
-
-The above process takes 20-30 minutes when installing from scratch. Most of the time
-is spent on downloading the
-{BSgenome.Hsapiens.UCSC.hg19} and {BSgenome.Hsapiens.UCSC.hg38} R packages
-(and yes, for simplicity we download both packages).
-In the end, confirm your conda environments have been installed correctly
-(notice how the paths are different to the `base` env installation after using the
-`--prefix` option above):
-
-```text
-$ (base) conda env list
-# conda environments:
-#
-base * /Users/you/miniconda3
-pcgr /Users/you/dir4/conda/pcgr
-pcgrr /Users/you/dir4/conda/pcgrr
-```
-
-#### c) Activate pcgr conda environment
-
-You need to activate the `conda/pcgr` conda environment, and test that it works
-correctly with e.g. `pcgr --version`:
-
-```text
-$ cd /Users/you/dir4/conda
-(base) $ conda activate ./conda/pcgr
-# note how the full path to the locally installed conda environment is now displayed
-
-(/Users/you/dir4/conda) $ which pcgr
-/Users/you/dir4/conda/pcgr/bin/pcgr
-
-(/Users/you/dir4/conda) $ pcgr --version
-pcgr X.X.X
-
-(/Users/you/dir4/conda) $ which pcgrr.R
-/Users/you/dir4/conda/pcgr/bin/pcgrr.R
-```
-
-You should now be all set up to run PCGR! Continue on to [an example run](running.html#example-run).
-
-
-
-### Option 2: Docker
-
-#### a) Install Docker
-
-For installing Docker, follow the instructions at
-for your Linux or MacOSX machine.
-
-#### b) Download PCGR Docker Image
-
-- Pull the [PCGR Docker image](https://hub.docker.com/r/sigven/pcgr/tags) from
- DockerHub with: `docker pull sigven/pcgr:X.X.X`
-
-#### c) Run PCGR Docker Container
-
-If you are familiar with working with Docker volumes ()
-you can run PCGR using Docker instead of conda using the `-v :` Docker option.
-You'll need to map your PCGR inputs to Docker container paths.
-
-For example, say you have the input VCF `sampleX.vcf.gz` stored in the
-directory `/Users/you/project1`. You would need to supply Docker with a
-`--volume` (or `-v`) option mapping the directory of that VCF with
-a directory inside the Docker container, e.g. `/home/input_vcf_dir`.
-That would become: `-v /Users/you/project1:/home/input_vcf_dir`
-(note the `:` separating your directory from the container's directory).
-
-Then your command would look something like this:
-
-```bash
-docker container run -it --rm \
- -v /Users/you/dir0/vep:/root/vep
- -v /Users/you/dir1/data:/root/pcgr_refdata \
- -v /Users/you/dir2/pcgr_inputs:/root/pcgr_inputs \
- -v /Users/you/dir3/pcgr_outputs:/root/pcgr_outputs \
- sigven/pcgr:1.4.1.9014 \
- pcgr \
- --input_vcf "/root/pcgr_inputs/tumor_sample.BRCA.vcf.gz" \
- --vep_dir "/root/vep/.vep" \
- --refdata_dir "/root/pcgr_refdata" \
- --output_dir "/root/pcgr_outputs" \
- --genome_assembly "grch38" \
- --sample_id "SampleB" \
- --assay "WGS" \
- --vcf2maf
-```