EasySeq RC-PCR SARS-CoV-2/COVID-19 WGS kit

Variant pipeline V0.9

Use with V3 and V4 of the WGS kit, else see general info

General info

This GitHub repository contains an automated pipeline for analysing EasySeq SARS-CoV-2 (COVID-19) sequencing data. It has been validated with 150/151 bp paired-end reads.

We advise downloading conda.tar.gz again after each update to make sure that all conda environments are in place.

v0.9 release (use with version 4.2 of EasySeq RC-PCR SARS-CoV-2 WGS kit)

Fix for incorrect primer trimming, which resulted in an incorrect BA.2 ORF1A:I3758V mutation call
New primer files added
Added the primerVersion option to select the primer version

v0.8.1 release (use with version 3 of EasySeq RC-PCR SARS-CoV-2 WGS kit)

This version includes the final fix for the HV69-70 region
Non-covered regions and indels in the consensus are now handled correctly (bcftools 1.12)
The new SARS-CoV-2 nomenclature is implemented by using a new pangolin version

version 3 of the EasySeq RC-PCR SARS-CoV-2 WGS kit

Use code version v0.7.0 or later
Implemented lofreq for variant calling, which gives much more accurate calls in the report. Consensus output is mostly unaffected.

version 1 or 2 of the EasySeq RC-PCR SARS-CoV-2 WGS kit

Use release v0.5.2 of this repository: https://github.com/JordyCoolen/easyseq_covid19/releases/tag/v0.5.2

In short:

Automated pipeline to analyse Illumina EasySeq COVID-19 samples to a variant report
The pipeline cleans the Illumina sequencing data
Uses the SARS-CoV-2 reference genome (NC_045512.2)
Custom EasySeq Primer filtering and correction
Mutations and deletions are measured
Fasta consensus of the sample is created
Lineage is determined
Output is available in a structured way
Full QC reports are created
PDF and HTML report as output

INSTALL

1. Install docker on your OS
2. docker pull jonovox/easyseq_covid19:latest
3. Download the newest release of the pipeline via https://github.com/JordyCoolen/easyseq_covid19/releases
4. Extract the source code
5. Go into the extracted/project folder
6. Download the conda environments via: https://surfdrive.surf.nl/files/index.php/s/ggoLXzMoa5iSZYa
7. Extract conda.tar.gz into the project folder from step 5
8. Proceed to the RUN examples
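
Before moving on, it can help to confirm that the project folder looks as expected. The sketch below is not part of the pipeline: the function name `check_project_layout` is hypothetical, and it assumes that extracting conda.tar.gz yields a `conda` directory next to COVID.nf (the path /workflow/conda/env-pangolin used in the PANGOLIN section suggests this, but it is an assumption).

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that the project folder contains the entries
# this README refers to. Prints each missing entry and returns non-zero
# if anything is missing.
check_project_layout() {
    local project_dir="$1"
    local missing=0
    local entry
    # COVID.nf, docker/run.sh and scripts/run_batch.sh are named in this README;
    # the conda directory is an assumption about how conda.tar.gz unpacks.
    for entry in COVID.nf docker/run.sh scripts/run_batch.sh conda; do
        if [ ! -e "${project_dir}/${entry}" ]; then
            echo "missing: ${entry}"
            missing=1
        fi
    done
    return "${missing}"
}
```

Calling `check_project_layout .` from inside the project folder prints nothing when all four entries are present.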

RUN

RUN_option1

Run the test sample first to set everything in place.
The first run of the variant pipeline deploys additional conda environments that the pipeline needs. This can take a while.
Open a docker runtime container from the image with write rights:

sh docker/run.sh covid jonovox/easyseq_covid19:latest

run the test sample inside the container

nextflow run COVID.nf --sampleName test -resume --outDir /workflow/output/test --reads "/workflow/input/test_OUT01_R{1,2}.fastq.gz"
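
To run your own sample, the same command can be adapted by changing the sample name and the read path. The helper below is a minimal sketch with a hypothetical name (`print_run_command`); it only prints the command, and it assumes your files follow the `_R1.fastq.gz` / `_R2.fastq.gz` naming of the test sample and that the output should go to /workflow/output/<sample name>.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the nextflow command for one paired-end sample.
# $1 = sample name
# $2 = path prefix shared by both read files
#      (everything before "_R1.fastq.gz" / "_R2.fastq.gz")
print_run_command() {
    local sample="$1"
    local prefix="$2"
    # The {1,2} pattern stays literal inside the quotes, so nextflow
    # (not the shell) expands it into the R1/R2 pair.
    printf 'nextflow run COVID.nf --sampleName %s -resume --outDir /workflow/output/%s --reads "%s_R{1,2}.fastq.gz"\n' \
        "${sample}" "${sample}" "${prefix}"
}
```

For example, `print_run_command test /workflow/input/test_OUT01` reproduces the test command shown above.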

RUN_option2

You can also run multiple samples sequentially (not in parallel):

bash scripts/run_batch.sh <path to folders containing the fastq.gz file> <extension of files> <threads> jonovox/easyseq_covid19:latest
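
Before starting a batch, it can be useful to see which complete read pairs a folder contains. The sketch below is an illustration only and is not the logic of scripts/run_batch.sh, which may differ. The function name `list_sample_prefixes` is hypothetical, and it assumes the `_R1` / `_R2` naming of the test sample.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the sample prefixes found in a folder.
# $1 = folder with read files
# $2 = file extension, for example ".fastq.gz"
list_sample_prefixes() {
    local folder="$1"
    local ext="$2"
    local r1
    local prefix
    for r1 in "${folder}"/*_R1"${ext}"; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "${r1}" ] || continue
        # Strip the "_R1<ext>" suffix to get the shared prefix.
        prefix="${r1%"_R1${ext}"}"
        # Only report samples whose R2 mate is present as well.
        if [ -e "${prefix}_R2${ext}" ]; then
            basename "${prefix}"
        else
            echo "warning: no R2 file for ${prefix}" >&2
        fi
    done
}
```

Samples with a missing R2 file are reported on standard error and left out of the list.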

OUTPUT

/workflow/output/test/
├── QC
│   ├── multiqc_data
│   │   ├── multiqc.log
│   │   ├── multiqc_data.json
│   │   ├── multiqc_fastp.txt
│   │   ├── multiqc_general_stats.txt
│   │   ├── multiqc_snpeff.txt
│   │   └── multiqc_sources.txt
│   ├── multiqc_report.html
│   ├── stats.txt
│   ├── test.fastp.json
│   ├── test.mosdepth.global.dist.txt
│   ├── test.mosdepth.summary.txt
│   ├── test.per-base.bed.gz
│   ├── test.per-base.bed.gz.csi
│   └── test_snpEff.csv
├── annotation
│   ├── snpEff_summary.html
│   ├── test_annot_table.txt
│   ├── test_snpEff.csv
│   └── test_snpEff.genes.txt
├── lineage
│   └── lineage_report.csv
├── mapping
│   ├── test.bam
│   ├── test.bam.bai
│   ├── test.final.bam
│   └── test.final.bam.bai
├── rawvcf
│   └── test.raw.vcf
├── report
│   ├── parameters.txt
│   ├── test.fasta
│   ├── test.html
│   └── test.pdf
├── uncovered
│   ├── test_noncov.bed
│   └── test_ubiq.bed
└── vcf
    ├── notpassed
    │   └── test.notpassed.vcf
    ├── test.final.vcf
    ├── test.final.vcf.gz
    ├── test.final.vcf.gz.csi
    └── test.variants.vcf
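
The lineage call can be read directly from lineage/lineage_report.csv. The following is a minimal sketch with a hypothetical function name (`print_lineages`). It assumes that the report is a comma-separated file whose header row contains a column named "lineage", as pangolin reports usually do, and that no quoted commas occur before that column.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of the "lineage" column for every
# data row of a pangolin lineage report.
# $1 = path to lineage_report.csv
print_lineages() {
    awk -F',' '
        NR == 1 {
            # Locate the column by name instead of relying on its position.
            for (i = 1; i <= NF; i++) if ($i == "lineage") col = i
            if (!col) { print "no lineage column found" > "/dev/stderr"; exit 1 }
            next
        }
        { print $col }
    ' "$1"
}
```

Inside the container, this could be called as `print_lineages /workflow/output/test/lineage/lineage_report.csv`.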

FLOW-DIAGRAM

[Flow diagram of the pipeline: flowchart.png]

TOOLS

nextflow
python
conda/bioconda
fastp
BWA MEM
samtools
bcftools
lofreq
mosdepth
bedtools
snpEff
multiQC
pangolin v3.0.5 (pangoLEARN 2021-06-05) (default in conda.tar.gz)

PANGOLIN

To update the pangolin tool and database, run the following commands:

sh docker/run.sh covid jonovox/easyseq_covid19:latest
conda activate /workflow/conda/env-pangolin
pangolin --update

DOCKER

build your own docker image

cd easyseq_covid19
docker build --rm -t <image name> ./

SINGULARITY

build SINGULARITY IMAGE from dockerhub

singularity build <imagename>.simg docker://jonovox/easyseq_covid19:latest

CONTRIBUTORS

Department of Medical Microbiology and Radboudumc Center for Infectious Diseases, Radboud university medical center, Nijmegen, The Netherlands

J.P.M. Coolen ([email protected])

NimaGen B.V., Nijmegen, The Netherlands

R.A. Lammerts (NimaGen B.V., Nijmegen, The Netherlands)
J.T. Vonk (Student HAN Bioinformatics, Nijmegen, The Netherlands)

REMARKS

spike S
21765-21770 HV 69-70 deletion

Versions 1 and 2 of the EasySeq RC-PCR SARS-CoV-2 WGS kit do not completely overlap the region 21765-21770 / HV 69-70.
If you use these versions of the WGS kit, please use:

variant pipeline v0.5.2
https://github.com/JordyCoolen/easyseq_covid19/releases/tag/v0.5.2

---->         This version solves the non-overlapping region 21765-21770 with a template-based strategy using KMA.
      <---    This method determines which template matches best: either wildtype (NC_045512.2) or
              a variant containing the 21765-21770 / HV 69-70 deletion. The result of this strategy
              is projected onto the VCF to ensure correct output. This works well for now because no other deletions are
              known at this exact location.

variant pipeline v0.7.0 or later

  ---->     In version 3 of the EasySeq RC-PCR SARS-CoV-2 WGS kit, the region 21765-21770 / HV 69-70 is
    <-  --   completely overlapped thanks to a new primer design. This version of the variant pipeline handles the
            data obtained using version 3 correctly.
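
To check how well the HV 69-70 region is covered in a given sample, the per-base depth file from the QC folder can be inspected. The sketch below is an illustration, not part of the pipeline: the function name `min_depth_hv6970` is hypothetical, and it assumes that positions 21765-21770 are 1-based and inclusive, and that the input is mosdepth per-base BED (tab-separated, 0-based half-open intervals with the depth in column 4).

```shell
#!/usr/bin/env bash
# Hypothetical helper: report the minimum depth over genome positions
# 21765-21770 (1-based, inclusive) from a mosdepth per-base BED stream
# read from standard input.
# In 0-based half-open BED coordinates the region is [21764, 21770).
min_depth_hv6970() {
    awk -F'\t' -v s=21764 -v e=21770 '
        # Keep every interval that overlaps the region.
        $2 < e && $3 > s {
            if (min == "" || $4 + 0 < min) min = $4 + 0
        }
        # Print NA when no interval overlapped the region.
        END { if (min == "") print "NA"; else print min }
    '
}
```

Usage inside the container could look like `gzip -dc /workflow/output/test/QC/test.per-base.bed.gz | min_depth_hv6970`. A low value means that at least one position in the region is poorly covered.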

REFERENCE

To cite this work, please use:

Coolen, J. P., Wolters, F., Tostmann, A., van Groningen, L. F., Bleeker-Rovers, C. P., Tan, E. C., ... & Melchers, W. J. (2021). SARS-CoV-2 whole-genome sequencing using reverse complement PCR: For easy, fast and accurate outbreak and variant analysis. Journal of Clinical Virology, 144, 104993. https://doi.org/10.1016/j.jcv.2021.104993

Please also cite the other programs used; see the list under TOOLS.

DISCLAIMER

This pipeline is for research use only. The code and pipeline are continuously under development. We cannot guarantee fully error-free results, especially given the fast developments in SARS-CoV-2/COVID-19 sequencing and the continuously mutating nature of the virus.
