
Commit

Merge pull request #8 from nf-core/dev
Release 1.0.0
charles-plessy authored Aug 27, 2024
2 parents e0cbc98 + c493be3 commit 95a1eff
Showing 25 changed files with 1,789 additions and 680 deletions.
10 changes: 1 addition & 9 deletions CHANGELOG.md
@@ -3,14 +3,6 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## v1.0dev - [date]
+## v1.0.0 "Sweet potato" - [August 27th, 2024]

Initial release of nf-core/pairgenomealign, created with the [nf-core](https://nf-co.re/) template.
-
-### `Added`
-
-### `Fixed`
-
-### `Dependencies`
-
-### `Deprecated`
4 changes: 4 additions & 0 deletions CITATIONS.md
@@ -8,6 +8,10 @@

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.
+## Pipeline design
+
+> Charles Plessy, Michael J. Mansfield, Aleksandra Bliznina, Aki Masunaga, Charlotte West, Yongkai Tan, Andrew W. Liu, Jan Grašič, María Sara del Río Pisula, Gaspar Sánchez-Serna, Marc Fabrega-Torrus, Alfonso Ferrández-Roldán, Vittoria Roncalli, Pavla Navratilova, Eric M. Thompson, Takeshi Onuma, Hiroki Nishida, Cristian Cañestro, Nicholas M. Luscombe. Extreme genome scrambling in marine planktonic Oikopleura dioica cryptic species. Genome Res. 2024. 34: 426-440; doi: [10.1101/2023.05.09.539028](https://doi.org/10.1101/gr.278295.123). PubMed ID: [38621828](https://pubmed.ncbi.nlm.nih.gov/38621828/)
+
## Pipeline tools

- [LAST](https://gitlab.com/mcfrith/last/)
15 changes: 7 additions & 8 deletions README.md
@@ -21,14 +21,9 @@

**nf-core/pairgenomealign** is a bioinformatics pipeline that aligns one or more _query_ genomes to a _target_ genome, and plots pairwise representations.

-<img src= "assets/tube_map.svg">
+![Tubemap workflow summary](docs/images/pairgenomealign-tubemap.png "Tubemap workflow summary")

-The pipeline can generate four kinds of outputs, depending on whether sequences of one genome can match the other genome multiple times or not.
-
-- _**many-to-many**_ (M2M): Every computed alignments between the _target_ and a _query_ genome.
-- _**many-to-one**_ (M2O): Alignments where regions of the _target_ genome are matched at most once by a _query_ genome.
-- _**one-to-many**_ (M2O): Alignments where regions of a _query_ genome are matched at most once by the _target_ genome.
-- _**one-to-one**_ (O2O) Alignment where regions of the _target_ and _query_ genomes are used at most once.
+The pipeline can generate four kinds of outputs, called _many-to-many_, _many-to-one_, _one-to-many_ and _one-to-one_, depending on whether sequences of one genome are allowed match the other genome multiple times or not.

These alignments are output in [MAF](https://genome.ucsc.edu/FAQ/FAQformat.html#format5) format, and optional line plot representations are output in PNG format.

@@ -77,7 +72,11 @@ For more details about the output files and reports, please refer to the

We thank the following people for their extensive assistance in the development of this pipeline:

-- [Mahdi Mohammed](https://github.com/U13bs1125): ported the original pipeline to _nf-core_ template 2.14.x.
+- [Mahdi Mohammed](https://github.com/U13bs1125) ported the original pipeline to _nf-core_ template 2.14.x.
+- [Martin Frith](https://github.com/mcfrith/), the author of LAST, gave us extensive feedback and advices.
+- [Michael Mansfield](https://github.com/mjmansfi) tested the pipeline and provided critical comments.
+- [Aleksandra Bliznina](https://github.com/aleksandrabliznina) contributed to the creation of the initial `last/*` modules.
+- [Jiashun Miao](https://github.com/miaojiashun) and [Huyen Pham](https://github.com/ngochuyenpham) tested the pipeline on vertebrate genomes.

## Contributions and Support

33 changes: 31 additions & 2 deletions assets/multiqc_config.yml
@@ -1,7 +1,7 @@
report_comment: >
-This report has been generated by the <a href="https://github.com/nf-core/pairgenomealign/tree/dev" target="_blank">nf-core/pairgenomealign</a>
+This report has been generated by the <a href="https://github.com/nf-core/pairgenomealign/releases/tag/1.0.0" target="_blank">nf-core/pairgenomealign</a>
analysis pipeline. For information about how to interpret these results, please see the
-<a href="https://nf-co.re/pairgenomealign/dev/docs/output" target="_blank">documentation</a>.
+<a href="https://nf-co.re/pairgenomealign/1.0.0/docs/output" target="_blank">documentation</a>.
report_section_order:
"nf-core-pairgenomealign-methods-description":
order: -1000
@@ -19,10 +19,39 @@ custom_data:
file_format: "tsv"
section_name: "Training parameter statistics"
plot_type: "table"
+headers:
+id:
+title: "ID"
+description: "target___query"
+substitution_percent_identity:
+title: "Substitution Percent Identity"
+"last -t":
+title: "Temperature"
+description: "Parameter for converting between scores and probability ratios. This affects the column ambiguity estimates. A score is converted to a probability ratio by this formula: exp(score / TEMPERATURE). The default value is 1/lambda, where lambda is the scale factor of the scoring matrix, which is calculated by the method of Yu and Altschul (YK Yu et al. 2003, PNAS 100(26):15688-93)."
+"last -a":
+title: "Gap existence"
+description: "Gap existence cost (lastal -a)"
+"last -b":
+title: "Gap extension"
+description: "Gap extension cost (lastal -b)"
+"last -A":
+title: "Insertion existence"
+description: "Insertion existence cost (lastal -A)"
+"last -B":
+title: "Insertion extension"
+description: "Insertion extension cost (lastal -B)"
last_o2o:
file_format: "tsv"
section_name: "Alignment statistics"
plot_type: "table"
+headers:
+id:
+title: "ID"
+description: "target__query"
+TotalAlignmentLength:
+title: "Total alignment length"
+PercentSimilarity:
+title: "Percent similarity"

sp:
last_o2o:
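
Note on the `last -t` header added above: its description gives the conversion formula as exp(score / TEMPERATURE). A minimal Python sketch of that arithmetic follows. The function name and the numeric values are hypothetical, chosen only for illustration; none of this is code from the pipeline or from LAST.

```python
import math


def score_to_probability_ratio(score: float, temperature: float) -> float:
    """Apply the formula quoted in the header description: exp(score / TEMPERATURE)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return math.exp(score / temperature)


# Hypothetical temperature and scores, for illustration only.
example_temperature = 4.0
for example_score in (0, 4, 8):
    ratio = score_to_probability_ratio(example_score, example_temperature)
    print(f"score={example_score:>2}  ratio={ratio:.3f}")
```

A score of zero always maps to a ratio of 1, and each additional TEMPERATURE units of score multiplies the ratio by e.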
2 changes: 1 addition & 1 deletion assets/schema_input.json
@@ -18,7 +18,7 @@
"format": "file-path",
"exists": true,
"pattern": "^\\S+\\.f(ast|n)?a(\\.gz)?$",
-"errorMessage": "Fasta file for genomes must be provided, cannot contain spaces and must have extension '.fa', '.fa.gz', '.fna', '.fna.gz', '.fasta' or '.fasta.gz'"
+"errorMessage": "Fasta file for genomes must be provided, cannot contain spaces and must have extension `.fa`, `.fa.gz`, `.fna`, `.fna.gz`, `.fasta` or `.fasta.gz`"
}
},
"required": ["sample", "fasta"]
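
Note on the `pattern` in `assets/schema_input.json`: the error message lists six accepted extensions, and the unchanged regular expression can be checked against them directly. The sketch below un-escapes the JSON string into a Python raw string and tries it on a few file names. The helper name and the sample file names are hypothetical, and Python's `re` module stands in for whichever JSON Schema validator the pipeline actually uses.

```python
import re

# The JSON string "^\\S+\\.f(ast|n)?a(\\.gz)?$" un-escaped into a Python raw string.
FASTA_NAME = re.compile(r"^\S+\.f(ast|n)?a(\.gz)?$")


def is_accepted_fasta_name(path: str) -> bool:
    """Return True if the path satisfies the schema's pattern (hypothetical helper)."""
    return FASTA_NAME.search(path) is not None


# Hypothetical file names, for illustration only.
candidates = [
    "genome.fa",
    "genome.fa.gz",
    "genome.fna",
    "genome.fna.gz",
    "genome.fasta",
    "genome.fasta.gz",
    "my genome.fa",   # contains a space
    "genome.fastq",   # not a FASTA extension
    "genome.fa.bz2",  # unsupported compression suffix
]
for name in candidates:
    print(f"{name!r}: {is_accepted_fasta_name(name)}")
```

The six extensions named in the error message are accepted, while names with spaces or other suffixes are rejected, which agrees with the wording of the message on both sides of the change.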