Current tracks

Karin Lagesen edited this page Oct 22, 2023 · 5 revisions

The pipeline has been developed as a series of scripts, each with a specific input and a set of logically connected analyses. Each one consists of a Nextflow script and a separate config file, which is used to specify inputs and software options for a given run.

The current pipeline contains the following scripts:

  • qc_script.nf: Quality control
    • FastQC is run on all input files, then MultiQC is run to create a summary.
  • specific_gene.nf: MLST, virulence and AMR annotation
    • The software ARIBA is used to annotate MLST, virulence and AMR directly from reads. The user must specify the species, which AMR and virulence databases to use, and, in the case of E. coli, which of the two available MLST schemes to use. A user can choose to run all three analyses, or only one or two of them.
  • asm_annot.nf: Assembly and annotation
    • This script first runs FastQC and MultiQC, then strips PhiX with BBDuk, trims reads with Trimmomatic, assembles with SPAdes, polishes the assemblies with Pilon, and finally evaluates them with QUAST. Common options for each of these programs can be specified in the config file for that script.
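
As a rough illustration of how the scripts listed above pair with their config files, the following Python sketch records the step order described for each script and builds the corresponding `nextflow run <script> -c <config>` command line. The `TRACKS` table and the `build_command` helper are hypothetical names introduced for this example, and the config file name `my_asm_annot.config` is an assumption, not a file shipped with the pipeline.

```python
from typing import Dict, List

# Step order per script, transcribed from the descriptions above.
# This table is illustrative only; the pipeline itself does not define it.
TRACKS: Dict[str, List[str]] = {
    "qc_script.nf": ["FastQC", "MultiQC"],
    "specific_gene.nf": ["ARIBA"],
    "asm_annot.nf": [
        "FastQC", "MultiQC", "BBDuk", "Trimmomatic",
        "SPAdes", "Pilon", "QUAST",
    ],
}


def build_command(script: str, config: str) -> List[str]:
    """Return the argument list for running one script with its config file.

    Uses the standard `nextflow run <script> -c <config>` form.
    The config file name is supplied by the caller (hypothetical here).
    """
    if script not in TRACKS:
        raise ValueError(f"unknown script: {script}")
    return ["nextflow", "run", script, "-c", config]


if __name__ == "__main__":
    # Hypothetical config file name, chosen only for this example.
    cmd = build_command("asm_annot.nf", "my_asm_annot.config")
    print(" ".join(cmd))
    print("steps:", ", ".join(TRACKS["asm_annot.nf"]))
```

The argument list can be passed to `subprocess.run` if the command should be launched from Python rather than typed in a shell.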