nf-core/finalproject is a bioinformatics pipeline for analyzing RNA-Seq data that need a host-filtering step. It is useful when the RNA-Seq experiment was performed in cell culture, or with another design in which the target transcriptome is likely to be contaminated with the host transcriptome.
- Read QC (FastQC)
- Align reads against the host organism genome and select reads that do not align (Bowtie2)
- Align reads against the target organism genome
- Present QC for raw reads and the alignments (MultiQC)
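The host-filtering and target-alignment steps above can be sketched with plain Bowtie2 calls. This is only an illustration under stated assumptions, not the pipeline's own module code: the helper names (run, filter_and_align), the index prefixes, and the output file names are hypothetical, and using Bowtie2 for the target alignment is an assumption, since the list names Bowtie2 only for the host step. Note also that --un-conc-gz keeps pairs that fail to align concordantly, which is slightly broader than "reads that do not align".

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print each command instead of executing it unless DRY_RUN=0 is set,
# so the sketch can be inspected without Bowtie2 or indexes in place.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: host_index and target_index are Bowtie2 index
# prefixes built beforehand with bowtie2-build.
filter_and_align() {
  local host_index=$1 target_index=$2 r1=$3 r2=$4 sample=$5

  # Host alignment: --un-conc-gz writes the pairs that fail to align
  # concordantly; Bowtie2 replaces "%" with 1 or 2 for each mate.
  run bowtie2 -x "$host_index" -1 "$r1" -2 "$r2" \
    --un-conc-gz "${sample}.nonhost.R%.fastq.gz" \
    -S "${sample}.host.sam"

  # Target alignment on the reads that were not assigned to the host.
  run bowtie2 -x "$target_index" \
    -1 "${sample}.nonhost.R1.fastq.gz" -2 "${sample}.nonhost.R2.fastq.gz" \
    -S "${sample}.target.sam"
}

# Dry run with placeholder names: prints the two Bowtie2 commands.
filter_and_align host_idx target_idx sample_R1.fastq.gz sample_R2.fastq.gz sample
```

By default the script only prints the commands; setting DRY_RUN=0 executes them, which requires Bowtie2 and both indexes to exist.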
Note: If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.
First, prepare a samplesheet with your input data that looks as follows:
samplesheet.csv:
sample,fastq_1,fastq_2
CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz
Each row represents a FASTQ file (single-end) or a pair of FASTQ files (paired-end).
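For many samples, the samplesheet can be generated rather than typed by hand. The sketch below handles paired-end data only and assumes the files follow the naming shown in the example (<sample>_R1_001.fastq.gz and <sample>_R2_001.fastq.gz). The function name make_samplesheet is hypothetical, and deriving the sample name from the file name is an assumption; rename samples afterwards if you prefer labels such as CONTROL_REP1.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write a paired-end samplesheet for every
# *_R1_001.fastq.gz file in a directory that has a matching R2 file.
make_samplesheet() {
  local fastq_dir=$1 out_csv=$2
  local r1 r2 sample

  echo "sample,fastq_1,fastq_2" > "$out_csv"

  for r1 in "$fastq_dir"/*_R1_001.fastq.gz; do
    # If nothing matches, the glob stays literal; skip it.
    [ -e "$r1" ] || continue

    # Derive the mate file and the sample name from the R1 file name.
    r2="${r1%_R1_001.fastq.gz}_R2_001.fastq.gz"
    sample=$(basename "$r1" _R1_001.fastq.gz)

    # This sketch skips reads without a mate instead of guessing
    # how single-end rows should be written.
    [ -e "$r2" ] || { echo "skipping $r1: no R2 file found" >&2; continue; }

    echo "${sample},${r1},${r2}" >> "$out_csv"
  done
}
```

Calling make_samplesheet with a FASTQ directory and an output path, for example make_samplesheet fastq samplesheet.csv, writes the header plus one row per complete pair.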
Now, you can run the pipeline using:
nextflow run iaradsouza1/finalproject \
-profile <docker/singularity/.../institute> \
--input samplesheet.csv \
--outdir <OUTDIR> \
-revision master
Warning: Please provide pipeline parameters via the CLI or the Nextflow -params-file option. Custom config files, including those provided by the -c Nextflow option, can be used to provide any configuration except for parameters; see docs.
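As a minimal sketch of the -params-file route, the two parameters from the run command can be written to a YAML file. The file name params.yaml and the output directory "results" are placeholders chosen for illustration, not pipeline defaults.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write a params file holding the same parameters as the CLI call
# (--input and --outdir). "results" is a placeholder directory name.
cat > params.yaml <<'EOF'
input: "samplesheet.csv"
outdir: "results"
EOF
```

The file is then passed with -params-file params.yaml in place of the --input and --outdir options, while -profile and -revision stay on the command line because they are Nextflow options rather than pipeline parameters.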
For more details, please refer to the usage documentation and the parameter documentation.
To see the results of a test run with a full-size dataset, refer to the results tab on the nf-core website pipeline page. For more details about the output files and reports, please refer to the output documentation.
nf-core/finalproject was originally written by Iara Souza.
If you would like to contribute to this pipeline, please see the contributing guidelines.
You can cite the nf-core publication as follows:
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.