Merge pull request #1 from Princeton-LSI-ResearchComputing/update-readme

Update README.md to include information on Quarto notebooks
Princeton-LSI-ResearchComputing · Apr 21, 2023 · 6588a2f
2 parents 938d481 + e7656c8
commit 6588a2f

Showing 2 changed files with 31 additions and 94 deletions.
diff --git a/README.md b/README.md
@@ -1,22 +1,22 @@
 # Snakemake workflow: smallRNA Pipeline
 
-[![Snakemake](https://img.shields.io/badge/snakemake-≥6.6.1-brightgreen.svg)](https://snakemake.bitbucket.io)
-[![Build Status](https://travis-ci.org/troycomi/adamson_smRNA.svg?branch=master)](https://travis-ci.org/snakemake-workflows/adamson_smRNA)
+[![Snakemake](https://img.shields.io/badge/snakemake-≥7.9.0-brightgreen.svg)](https://snakemake.github.io/)
 
-Workflow for analysis of smRNA
+Workflow for the analysis of small RNA-seq data for Yan et al., "An endogenous
+small RNA-binding protein safeguards prime editing" (in press).
 
-This is the template for a new Snakemake workflow. Replace this text with a
-comprehensive description, covering the purpose and domain.
+The workflow is written using [Snakemake](https://snakemake.github.io/) and
+[Quarto](https://quarto.org/).
 
-Insert your code into the respective folders, i.e. `scripts`, `rules` and
-`envs`. Define the entry point of the workflow in the `Snakefile` and the main
-configuration in the `config.yaml` file.
+Dependencies are installed using [Bioconda](https://bioconda.github.io/) where
+possible.
 
-The workflow is written using [Snakemake](https://snakemake.readthedocs.io/).
+The workflow consists of two parts: one written in Snakemake, the other
+composed of Quarto notebooks.
 
-Dependencies are installed using [Bioconda](https://bioconda.github.io/) where possible.
+## Snakemake workflow
 
-## Setup environment and run workflow
+### Set up environment and run workflow
 
 1. Clone workflow into working directory
 
@@ -47,98 +47,35 @@ Dependencies are installed using [Bioconda](https://bioconda.github.io/) where p
     source activate <project>
     ```
 
-6. Execute workflow
+6. Execute main workflow
 
     ```bash
-    snakemake -n
+    snakemake --cores 1
     ```
 
-7. Investigate results
+## Quarto notebooks
 
-    After successful execution, you can create a self-contained interactive
-    HTML report with all results via:
-
-    ```bash
-    snakemake --report report.html
-    ```
-
-    This report can, e.g., be forwarded to your collaborators.
-
-## Running workflow on `gen-comp1`
-
-```bash
-snakemake --cluster-config cluster_config.cetus.yaml \
-    --drmaa " \
-    --cpus-per-task={cluster.n} \
-    --mem={cluster.memory} \
-    --qos={cluster.qos}" \
-    --use-conda -w 60 -rp -j 1000
-```
-
-## Generating the R report
-
-Currently, the R report has to be run separately.
+The Quarto notebooks utilize [R](https://www.r-project.org/) and are run
+separately.
 
 1. Run the workflow as above
 
-2. Load the Rproject `adamson-smallRNA.Rproj` in RStudio
-
-3. Load `exogenous-rna-profiles.Rmd`, run all cells, and save to create the
-   `exogenous_targets_paired_read_summary.tsv` as well as the report
-   `exogenous-rna-profiles.nb.html`
-
-### R packages
-
-1. If needed, install the `renv` package
-
-    ```r
-    install.packages("renv")
-    ```
-
-2. Load required packages
-
-    ```r
-    renv::activate()
-    ```
-
-## Advanced
-
-The following recipe provides established best practices for running and
-extending this workflow in a reproducible way.
-
-1. [Fork](https://help.github.com/en/articles/fork-a-repo) the repository to a
-   personal or lab account.
-
-2. [Clone](https://help.github.com/en/articles/cloning-a-repository) the fork
-   to the desired working directory for the concrete project/run on your
-   machine.
-
-3. [Create a new
-   branch](https://git-scm.com/docs/gittutorial#_managing_branches) (the
-   project-branch) within the clone and switch to it. The branch will contain
-   any project-specific modifications (e.g. to configuration, but also to
-   code).
-
-4. Modify the config, and any necessary sheets (and probably the workflow) as
-   needed.
-
-5. Commit any changes and push the project-branch to your fork on GitHub.
-
-6. Run the analysis.
-
-7. Optional: Merge back any valuable and generalizable changes to the [upstream
-   repository](https://github.com/troycomi/adamson_smRNA) via a [**pull
-   request**](https://help.github.com/en/articles/creating-a-pull-request).
-   This would be **greatly appreciated**.
-
-8. Optional: Push results (plots/tables) to the remote branch on your fork.
+2. Load the Rproject `pe-small-rna-seq-analysis.Rproj` in RStudio.
 
-9. Optional: Create a self-contained workflow archive for publication along
-   with the paper (snakemake --archive).
+3. This project uses
+   [`renv`](https://rstudio.github.io/renv/articles/renv.html) to keep track of
+   installed packages. Install `renv` if not installed and load dependencies
+   with `renv::restore()`.
 
-10. Optional: Delete the local clone/workdir to free space.
+4. Load one of the Quarto notebooks below and run all of the cells, or use the
+   "Render" button in RStudio.
 
-## Testing
+   * `biotype-comparison.qmd`
+   * `fragment-size-distributions.qmd`
+   * `alignment_statistics.qmd`
+   * `coverage-plots.qmd`
 
-Tests cases are in the subfolder `.test`. They should be executed via
-continuous integration with Travis CI.
+5. Some of the notebooks use parameters to generate a few different versions of
+   the plots. If Quarto and all of the required R packages are installed, you
+   can use the `render_quarto_reports.sh` script to render all of the Quarto
+   notebooks.
diff --git a/adamson-smallRNA.Rproj → pe-small-rna-seq-analysis.Rproj b/adamson-smallRNA.Rproj → pe-small-rna-seq-analysis.Rproj
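
Note on step 5 of the updated README: the contents of `render_quarto_reports.sh` are not shown in this diff. As a rough illustration only, a script that renders the four listed notebooks could look like the sketch below. The `DRY_RUN` variable and the `render_all` function are hypothetical names introduced here; the only real command assumed is `quarto render <file>`. Since the parameter names are not given in the README, the sketch passes none.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; NOT the repository's actual render_quarto_reports.sh.
set -euo pipefail

# Notebook names exactly as listed in the README.
notebooks=(
    biotype-comparison.qmd
    fragment-size-distributions.qmd
    alignment_statistics.qmd
    coverage-plots.qmd
)

# Assumption: print the commands by default so the sketch is safe to run
# without Quarto installed; set DRY_RUN=0 to actually render.
DRY_RUN="${DRY_RUN:-1}"

render_all() {
    local nb
    for nb in "${notebooks[@]}"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "quarto render $nb"
        else
            quarto render "$nb"
        fi
    done
}

render_all
```

Running it with `DRY_RUN=0` from the project directory would call `quarto render` once per notebook. The real script presumably also passes the notebook parameters mentioned in step 5, for example through Quarto's `-P name:value` option.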