From e7656c8490800d6326e636b48a406b2c8ce040d8 Mon Sep 17 00:00:00 2001
From: Lance Parsons
Date: Fri, 21 Apr 2023 16:35:49 -0400
Subject: [PATCH] Update README.md to include information on Quarto notebooks

---
 README.md                                     | 125 +++++------------------
 ...A.Rproj => pe-small-rna-seq-analysis.Rproj |   0
 2 files changed, 31 insertions(+), 94 deletions(-)
 rename adamson-smallRNA.Rproj => pe-small-rna-seq-analysis.Rproj (100%)

diff --git a/README.md b/README.md
index 2f02be7..5718cd0 100644
--- a/README.md
+++ b/README.md
@@ -1,22 +1,22 @@
 # Snakemake workflow: smallRNA Pipeline
 
-[![Snakemake](https://img.shields.io/badge/snakemake-≥6.6.1-brightgreen.svg)](https://snakemake.bitbucket.io)
-[![Build Status](https://travis-ci.org/troycomi/adamson_smRNA.svg?branch=master)](https://travis-ci.org/snakemake-workflows/adamson_smRNA)
+[![Snakemake](https://img.shields.io/badge/snakemake-≥7.9.0-brightgreen.svg)](https://snakemake.github.io/)
 
-Workflow for analysis of smRNA
+Workflow for the analysis of small RNA-seq data for Yan et al., "An endogenous
+small RNA-binding protein safeguards prime editing" (in press).
 
-This is the template for a new Snakemake workflow. Replace this text with a
-comprehensive description, covering the purpose and domain.
+The workflow is written using [Snakemake](https://snakemake.github.io/) and
+[Quarto](https://quarto.org/).
 
-Insert your code into the respective folders, i.e. `scripts`, `rules` and
-`envs`. Define the entry point of the workflow in the `Snakefile` and the main
-configuration in the `config.yaml` file.
+Dependencies are installed using [Bioconda](https://bioconda.github.io/) where
+possible.
 
-The workflow is written using [Snakemake](https://snakemake.readthedocs.io/).
+The workflow consists of two pieces: one written in Snakemake, the other
+composed of Quarto notebooks.
 
-Dependencies are installed using [Bioconda](https://bioconda.github.io/) where possible.
+## Snakemake workflow
 
-## Setup environment and run workflow
+### Setup environment and run workflow
 
 1. Clone workflow into working directory
 
@@ -47,98 +47,35 @@ Dependencies are installed using [Bioconda](https://bioconda.github.io/) where p
    source activate
    ```
 
-6. Execute workflow
+6. Execute main workflow
 
    ```bash
-   snakemake -n
+   snakemake --cores 1
    ```
 
-7. Investigate results
+## Quarto notebooks
 
-   After successful execution, you can create a self-contained interactive
-   HTML report with all results via:
-
-   ```bash
-   snakemake --report report.html
-   ```
-
-   This report can, e.g., be forwarded to your collaborators.
-
-## Running workflow on `gen-comp1`
-
-```bash
-snakemake --cluster-config cluster_config.cetus.yaml \
-    --drmaa " \
-    --cpus-per-task={cluster.n} \
-    --mem={cluster.memory} \
-    --qos={cluster.qos}" \
-    --use-conda -w 60 -rp -j 1000
-```
-
-## Generating the R report
-
-Currently, the R report has to be run separately.
+The Quarto notebooks utilize [R](https://www.r-project.org/) and are run
+separately.
 
 1. Run the workflow as above
 
-2. Load the Rproject `adamson-smallRNA.Rproj` in RStudio
-
-3. Load `exogenous-rna-profiles.Rmd`, run all cells, and save to create the
-   `exogenous_targets_paired_read_summary.tsv` as well as the report
-   `exogenous-rna-profiles.nb.html`
-
-### R packages
-
-1. If needed, install the `renv` package
-
-   ```r
-   install.packages("renv")
-   ```
-
-2. Load required packages
-
-   ```r
-   renv::activate()
-   ```
-
-## Advanced
-
-The following recipe provides established best practices for running and
-extending this workflow in a reproducible way.
-
-1. [Fork](https://help.github.com/en/articles/fork-a-repo) the repository to a
-   personal or lab account.
-
-2. [Clone](https://help.github.com/en/articles/cloning-a-repository) the fork
-   to the desired working directory for the concrete project/run on your
-   machine.
-
-3. [Create a new
-   branch](https://git-scm.com/docs/gittutorial#_managing_branches) (the
-   project-branch) within the clone and switch to it. The branch will contain
-   any project-specific modifications (e.g. to configuration, but also to
-   code).
-
-4. Modify the config, and any necessary sheets (and probably the workflow) as
-   needed.
-
-5. Commit any changes and push the project-branch to your fork on GitHub.
-
-6. Run the analysis.
-
-7. Optional: Merge back any valuable and generalizable changes to the [upstream
-   repository](https://github.com/troycomi/adamson_smRNA) via a [**pull
-   request**](https://help.github.com/en/articles/creating-a-pull-request).
-   This would be **greatly appreciated**.
-
-8. Optional: Push results (plots/tables) to the remote branch on your fork.
+2. Load the Rproject `pe-small-rna-seq-analysis.Rproj` in RStudio.
 
-9. Optional: Create a self-contained workflow archive for publication along
-   with the paper (snakemake --archive).
+3. This project uses
+   [`renv`](https://rstudio.github.io/renv/articles/renv.html) to keep track of
+   installed packages. Install `renv` if not installed and load dependencies
+   with `renv::restore()`.
 
-10. Optional: Delete the local clone/workdir to free space.
+4. Load one of the Quarto notebooks below and run all of the cells
+   or use the "Render" button in RStudio.
 
-## Testing
+   * `biotype-comparison.qmd`
+   * `fragment-size-distributions.qmd`
+   * `alignment_statistics.qmd`
+   * `coverage-plots.qmd`
 
-Tests cases are in the subfolder `.test`. They should be executed via
-continuous integration with Travis CI.
+5. Some of the notebooks use parameters to generate a few different versions of
+   the plots. If Quarto and all of the required R packages are installed, you
+   can use the `render_quarto_reports.sh` script to render all of the Quarto
+   notebooks.
diff --git a/adamson-smallRNA.Rproj b/pe-small-rna-seq-analysis.Rproj
similarity index 100%
rename from adamson-smallRNA.Rproj
rename to pe-small-rna-seq-analysis.Rproj
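
The batch rendering mentioned in step 5 of the new README could look roughly like the sketch below. It is a hypothetical illustration and not the contents of `render_quarto_reports.sh`, which this patch does not show. The notebook names come from the README. The parameter name `sample_group`, its value, the output naming scheme, and the `QUARTO` override variable are assumptions made for the example. The sketch relies on `quarto render` with its `-P name:value` and `--output` options.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; NOT the repository's render_quarto_reports.sh.
set -euo pipefail

# Notebook names as listed in the README.
notebooks=(
  biotype-comparison.qmd
  fragment-size-distributions.qmd
  alignment_statistics.qmd
  coverage-plots.qmd
)

# Render every notebook once with its default parameters.
# QUARTO is an assumed override: set QUARTO=echo to print the commands
# instead of running Quarto.
render_notebooks() {
  local quarto_cmd="${QUARTO:-quarto}"
  local nb
  for nb in "${notebooks[@]}"; do
    "$quarto_cmd" render "$nb"
  done
}

# Render one notebook with a single parameter override and write the result
# to a file named after the parameter value. The parameter name and value
# passed in are placeholders chosen by the caller.
render_with_param() {
  local quarto_cmd="${QUARTO:-quarto}"
  local nb="$1" name="$2" value="$3"
  "$quarto_cmd" render "$nb" -P "${name}:${value}" \
    --output "${nb%.qmd}-${value}.html"
}
```

Running `QUARTO=echo render_notebooks` after sourcing the block prints the four `render` commands without invoking Quarto, which is a cheap way to check the notebook list before a full render.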