mRNA-Seq Workflow

Workflow for the analysis of mRNA-seq data for Yan et al., "An endogenous small RNA-binding protein safeguards prime editing" (in press).

The workflow is written using Snakemake and Quarto.

Dependencies are installed using Bioconda where possible.

The workflow has two parts: a Snakemake pipeline and a set of Quarto notebooks.

Running the Snakemake workflow

Here, we create two workflows to work with the two subsets of data separately: HEK3_1TtoA and PRNP_6GtoT. Run the workflow in each directory separately.

The workflows use the publicly available rna-seq-star-deseq2 workflow. Citation: https://doi.org/10.5281/zenodo.4737358

Clone workflow into working directory
```
git clone <repository> <dir>
cd <dir>
```

Download input data (or skip this step and use demo-data)

Copy the fastq files into the data directory
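
As a minimal sketch of this copy step, the helper below gathers gzipped fastq files from a download location into a data directory. The function name `stage_fastq`, the `.fastq.gz` and `.fq.gz` extensions, and the default destination `data` are assumptions made for illustration; adjust them to match the paths the workflow's units.yaml expects.

```shell
# Hypothetical helper: copy gzipped fastq files from SRC_DIR into DEST_DIR.
# Usage: stage_fastq SRC_DIR [DEST_DIR]   (DEST_DIR defaults to "data")
stage_fastq() {
    src_dir="$1"
    dest_dir="${2:-data}"
    mkdir -p "$dest_dir" || return 1
    found=0
    for fq in "$src_dir"/*.fastq.gz "$src_dir"/*.fq.gz; do
        # An unmatched glob stays literal, so skip anything that is not a real file.
        [ -f "$fq" ] || continue
        cp -- "$fq" "$dest_dir"/ || return 1
        found=$((found + 1))
    done
    echo "copied $found file(s) into $dest_dir"
    # Fail when nothing was copied so a wrong source path does not go unnoticed.
    [ "$found" -gt 0 ]
}

# Example call (the source path is a placeholder):
#   stage_fastq /path/to/downloaded/fastq data
```

The non-zero exit status on an empty source directory is a deliberate choice: it surfaces a mistyped path before the workflow is started.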

Edit the configuration as needed (not needed if using demo-data)

# Edit location of fastq files
nano HEK3_1TtoA/config/units.yaml
nano PRNP_6GtoT/config/units.yaml

# Generally, these can remain unchanged 
nano HEK3_1TtoA/config/samples.yaml
nano PRNP_6GtoT/config/samples.yaml
nano HEK3_1TtoA/config/config.yaml
nano PRNP_6GtoT/config/config.yaml
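
After editing units.yaml, it can help to confirm that every fastq path it mentions actually exists. The sketch below is deliberately format-agnostic: it assumes only that paths appear in the file as tokens ending in `.fastq` or `.fq` (optionally followed by `.gz`) and does not parse the YAML. The function name `check_fastq_paths` is hypothetical, and relative paths are resolved against the current directory, so run it from the directory the workflow is run from.

```shell
# Hypothetical helper: report fastq paths named in a units file that are missing on disk.
# Usage: check_fastq_paths UNITS_FILE
check_fastq_paths() {
    units_file="$1"
    rc=0
    # grep exits non-zero when nothing matches; "|| true" keeps that from
    # aborting callers that run under "set -e".
    fq_paths=$(grep -oE "[^[:space:]\"',]+\.(fastq|fq)(\.gz)?" "$units_file" || true)
    for fq in $fq_paths; do
        if [ ! -e "$fq" ]; then
            echo "missing: $fq" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example calls, one per workflow directory:
#   (cd HEK3_1TtoA && check_fastq_paths config/units.yaml)
#   (cd PRNP_6GtoT && check_fastq_paths config/units.yaml)
```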

Install Snakemake and Snakedeploy

mamba create -c conda-forge -c bioconda --name snakemake snakemake snakedeploy
mamba activate snakemake

Set up workflow-specific resources

1. Modify workflow/profiles/default/config.yaml to ensure rules have the resources they need to run on the cluster.

Run the workflow (using cluster options is recommended)

snakemake --use-conda --cores 1

Generate a report

snakemake --report report.zip
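
Because each workflow is run from its own directory, the run and report commands above have to be issued twice. A small wrapper, sketched below, repeats any command in both workflow directories. The function name `for_each_workflow` is hypothetical; the directory names are the two used in this repository, and the function is assumed to be called from the repository root.

```shell
# Hypothetical helper: run the given command inside each workflow directory,
# stopping at the first failure.
for_each_workflow() {
    for wf in HEK3_1TtoA PRNP_6GtoT; do
        echo "== $wf =="
        # The subshell keeps the caller's working directory unchanged.
        ( cd "$wf" && "$@" ) || return 1
    done
}

# Example calls (from the repository root):
#   for_each_workflow snakemake --use-conda --cores 1
#   for_each_workflow snakemake --report report.zip
```

Stopping at the first failure avoids generating a report for one subset while the other has silently failed.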

Quarto notebooks

The Quarto notebooks utilize R and are run separately.

1. Run the Snakemake workflows as described above.
2. Open the R project ./pe-mrna-seq-diffexp.Rproj in RStudio.
3. This project uses renv to keep track of installed packages. Install renv if it is not already installed, then load the dependencies with renv::restore().
4. Open the Quarto notebook ./mrna-seq-venn-diag.qmd and either run all of the cells or use the "Render" button in RStudio.
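
The same steps can also be run from a terminal instead of RStudio. The sketch below assumes that `Rscript` and the `quarto` command-line tool are installed and on the PATH, and that it is run from the repository root so the project's .Rprofile and renv.lock are picked up. The function name `render_notebook` is hypothetical and is not part of the repository.

```shell
# Hypothetical helper: restore the renv library, then render a Quarto notebook.
# Usage: render_notebook [NOTEBOOK]   (defaults to mrna-seq-venn-diag.qmd)
render_notebook() {
    notebook="${1:-mrna-seq-venn-diag.qmd}"
    for tool in Rscript quarto; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found on PATH" >&2
            return 127
        fi
    done
    # prompt = FALSE makes renv::restore() run without asking for confirmation.
    Rscript -e 'renv::restore(prompt = FALSE)' && quarto render "$notebook"
}

# Example call (from the repository root):
#   render_notebook mrna-seq-venn-diag.qmd
```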
