Merge pull request #1 from phac-nml/de-gasclustering-pipeline
De gasclustering pipeline
sgsutcliffe authored Dec 10, 2024
2 parents 5b13810 + ff3f028 commit cbea074
Showing 38 changed files with 91 additions and 569 deletions.
18 changes: 9 additions & 9 deletions .github/CONTRIBUTING.md
@@ -1,17 +1,17 @@
# phac-nml/gasclustering: Contributing Guidelines
# phac-nml/fastmatchirida: Contributing Guidelines

Hi there!
Many thanks for taking an interest in improving phac-nml/gasclustering.
Many thanks for taking an interest in improving phac-nml/fastmatchirida.

We try to manage the required tasks for phac-nml/gasclustering using GitHub issues, you probably came to this page when creating one.
We try to manage the required tasks for phac-nml/fastmatchirida using GitHub issues, you probably came to this page when creating one.
Please use the pre-filled template to save time.

## Contribution workflow

If you'd like to write some code for phac-nml/gasclustering, the standard workflow is as follows:
If you'd like to write some code for phac-nml/fastmatchirida, the standard workflow is as follows:

1. Check that there isn't already an issue about your idea in the [phac-nml/gasclustering issues](https://github.com/phac-nml/gasclustering/issues) to avoid duplicating work. If there isn't one already, please create one so that others know you're working on this
2. [Fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the [phac-nml/gasclustering repository](https://github.com/phac-nml/gasclustering) to your GitHub account
1. Check that there isn't already an issue about your idea in the [phac-nml/fastmatchirida issues](https://github.com/phac-nml/fastmatchirida/issues) to avoid duplicating work. If there isn't one already, please create one so that others know you're working on this
2. [Fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the [phac-nml/fastmatchirida repository](https://github.com/phac-nml/fastmatchirida) to your GitHub account
3. Make the necessary changes / additions within your forked repository following [Pipeline conventions](#pipeline-contribution-conventions)
4. Use `nf-core schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10).
5. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged
@@ -52,11 +52,11 @@ These tests are run both with the latest available version of `Nextflow` and als

## Getting help

For further information/help, please consult the [phac-nml/gasclustering documentation](https://github.com/phac-nml/gasclustering/).
For further information/help, please consult the [phac-nml/fastmatchirida documentation](https://github.com/phac-nml/fastmatchirida/).

## Pipeline contribution conventions

To make the phac-nml/gasclustering code and processing logic more understandable for new contributors and to ensure quality, we semi-standardise the way the code and other contributions are written.
To make the phac-nml/fastmatchirida code and processing logic more understandable for new contributors and to ensure quality, we semi-standardise the way the code and other contributions are written.

### Adding a new step

@@ -105,7 +105,7 @@ This repo includes a devcontainer configuration which will create a GitHub Codes

To get started:

- Open the repo in [Codespaces](https://github.com/phac-nml/gasclustering/codespaces)
- Open the repo in [Codespaces](https://github.com/phac-nml/fastmatchirida/codespaces)
- Tools installed
- nf-core
- Nextflow
4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/bug_report.yml
@@ -1,6 +1,6 @@
name: Bug report
description: Report something that is broken or incorrect
labels: bug
labels: [bug]
body:
- type: markdown
attributes:
@@ -44,4 +44,4 @@ body:
* Executor _(eg. slurm, local, awsbatch)_
* Container engine: _(e.g. Docker, Singularity, Conda, Podman, Shifter, Charliecloud, or Apptainer)_
* OS _(eg. CentOS Linux, macOS, Linux Mint)_
* Version of phac-nml/gasclustering _(eg. 1.1, 1.5, 1.8.2)_
* Version of phac-nml/fastmatchirida _(eg. 1.1, 1.5, 1.8.2)_
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/config.yml
@@ -1,4 +1,4 @@
contact_links:
- name: "GitHub"
url: https://github.com/phac-nml/gasclustering
url: https://github.com/phac-nml/fastmatchirida
about: The GitHub page for development.
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/feature_request.yml
@@ -1,5 +1,5 @@
name: Feature request
description: Suggest an idea for the phac-nml/gasclustering pipeline
description: Suggest an idea for the phac-nml/fastmatchirida pipeline
labels: enhancement
body:
- type: textarea
8 changes: 4 additions & 4 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,21 +1,21 @@
<!--
# phac-nml/gasclustering pull request
# phac-nml/fastmatchirida pull request
Many thanks for contributing to phac-nml/gasclustering!
Many thanks for contributing to phac-nml/fastmatchirida!
Please fill in the appropriate checklist below (delete whatever is not relevant).
These are the most common things requested on pull requests (PRs).
Remember that PRs should be made against the dev branch, unless you're preparing a pipeline release.
Learn more about contributing: [CONTRIBUTING.md](https://github.com/phac-nml/gasclustering/tree/main/.github/CONTRIBUTING.md)
Learn more about contributing: [CONTRIBUTING.md](https://github.com/phac-nml/fastmatchirida/tree/main/.github/CONTRIBUTING.md)
-->

## PR checklist

- [ ] This comment contains a description of changes (with reason).
- [ ] If you've fixed a bug or added code that should be tested, add tests!
- [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/phac-nml/gasclustering/tree/main/.github/CONTRIBUTING.md)
- [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/phac-nml/fastmatchirida/tree/main/.github/CONTRIBUTING.md)
- [ ] Make sure your code lints (`nf-core lint`).
- [ ] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- [ ] Check for unexpected warnings in debug mode (`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`).
4 changes: 2 additions & 2 deletions .github/workflows/branch.yml
@@ -11,9 +11,9 @@ jobs:
steps:
# PRs to the phac-nml repo main branch are only ok if coming from the phac-nml repo `dev` or any `patch` branches
- name: Check PRs
if: github.repository == 'phac-nml/gasclustering'
if: github.repository == 'phac-nml/fastmatchirida'
run: |
{ [[ ${{github.event.pull_request.head.repo.full_name }} == phac-nml/gasclustering ]] && [[ $GITHUB_HEAD_REF == "dev" ]]; } || [[ $GITHUB_HEAD_REF == "patch" ]]
{ [[ ${{github.event.pull_request.head.repo.full_name }} == phac-nml/fastmatchirida ]] && [[ $GITHUB_HEAD_REF == "dev" ]]; } || [[ $GITHUB_HEAD_REF == "patch" ]]
# If the above check failed, post a comment on the PR explaining the failure
# NOTE - this doesn't currently work if the PR is coming from a fork, due to limitations in GitHub actions secrets
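
The `Check PRs` condition above can be tried outside of GitHub Actions. The following is a minimal sketch, assuming a hypothetical helper `check_pr_source` whose two arguments stand in for `github.event.pull_request.head.repo.full_name` and `$GITHUB_HEAD_REF`; the function and the `someone/fork` repository name are illustrative and do not appear in the workflow. As written, the `patch` clause is not tied to the repository name, so a branch named `patch` passes from any repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring the boolean logic of the "Check PRs" step.
# $1 stands in for github.event.pull_request.head.repo.full_name
# $2 stands in for $GITHUB_HEAD_REF
check_pr_source() {
    local head_repo="$1"
    local head_ref="$2"
    # Passes when the PR comes from the upstream `dev` branch,
    # or when the head branch is named exactly `patch`.
    { [[ "$head_repo" == "phac-nml/fastmatchirida" ]] && [[ "$head_ref" == "dev" ]]; } \
        || [[ "$head_ref" == "patch" ]]
}

check_pr_source "phac-nml/fastmatchirida" "dev" && echo "allowed"
check_pr_source "someone/fork" "patch" && echo "allowed"
check_pr_source "someone/fork" "dev" || echo "rejected"
```

The grouping with `{ ...; }` keeps the repository and `dev` tests together, so the `||` applies to the pair rather than to the branch test alone.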
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
@@ -19,7 +19,7 @@ jobs:
test:
name: Run pipeline with test data
# Only run on push if this is the phac-nml dev branch (merged PRs)
if: "${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'phac-nml/gasclustering') }}"
if: "${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'phac-nml/fastmatchirida') }}"
runs-on: ubuntu-latest
strategy:
matrix:
@@ -31,7 +31,7 @@ jobs:
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4

- name: Install Nextflow
uses: nf-core/setup-nextflow@v1
uses: nf-core/setup-nextflow@v2
with:
version: "${{ matrix.NXF_VER }}"

8 changes: 4 additions & 4 deletions .nf-core.yml
@@ -2,15 +2,15 @@ repository_type: pipeline
nf_core_version: "3.0.1"
lint:
files_exist:
- assets/nf-core-gasclustering_logo_light.png
- docs/images/nf-core-gasclustering_logo_light.png
- docs/images/nf-core-gasclustering_logo_dark.png
- assets/nf-core-fastmatchirida_logo_light.png
- docs/images/nf-core-fastmatchirida_logo_light.png
- docs/images/nf-core-fastmatchirida_logo_dark.png
- .github/workflows/awstest.yml
- .github/workflows/awsfulltest.yml
- lib/Utils.groovy
- lib/WorkflowMain.groovy
- lib/NfcoreTemplate.groovy
- lib/WorkflowGasclustering.groovy
- lib/Workflowfastmatchirida.groovy
files_unchanged:
- assets/sendmail_template.txt
- assets/email_template.html
50 changes: 3 additions & 47 deletions CHANGELOG.md
@@ -1,54 +1,10 @@
# phac-nml/gasclustering: Changelog
# phac-nml/fastmatchirida: Changelog

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.4.0] - 2024-11-07
## [0.1.0] - In Development

- Added the ability to include a `sample_name` column in the input samplesheet.csv. Allows for compatibility with IRIDA-Next input configuration.
fastmatchirida is built using Gasclustering [0.4.0] as a template.

- `sample_name` special characters (non-alphanumeric with exception of "_" and ".") will be replaced with `"_"`
- If no `sample_name` is supplied in the column `sample` will be used
- To avoid repeat values for `sample_name` all `sample_name` values will be suffixed with the unique `sample` value from the input file

- Fixed linting issues in CI caused by nf-core 3.0.1

## [0.3.0] - 2024-09-10

### Changed

- Upgraded `profile_dist` container to version `1.0.2`
- Upgraded `locidex/merge` to version `0.2.3` and updated `input_assure` and test data for compatibility with the new `mlst.json` allele file format.
- [PR28](https://github.com/phac-nml/gasclustering/pull/28)
- Aligned container registry handling in configuration files and modules with `phac-nml/pipeline-standards`
- [PR28](https://github.com/phac-nml/gasclustering/pull/28)

This pipeline is now compatible only with output generated by [Locidex v0.2.3+](https://github.com/phac-nml/locidex) and [Mikrokondo v0.4.0+](https://github.com/phac-nml/mikrokondo/releases/tag/v0.4.0).

## [0.2.0] - 2024-06-26

### Added

- Support for mismatched IDs between the samplesheet ID and the ID listed in the corresponding allele file.

### Changed

- Updated ArborView to v0.0.7-rc1.

### Fixed

- The scaled distance thresholds provided when using `--pd_distm scaled` and `--gm_thresholds` are now correctly understood as percentages in the range [0.0, 100.0].

## [0.1.0] - 2024-05-28

Initial release of the Genomic Address Service Clustering pipeline to be used for distance-based clustering of cg/wgMLST data.

### `Added`

- Input of cg/wgMLST allele calls produced from [locidex](https://github.com/phac-nml/locidex).
- Output of a dendrogram, cluster codes, and visualization using [profile_dists](https://github.com/phac-nml/profile_dists), [gas mcluster](https://github.com/phac-nml/genomic_address_service), and ArborView.

[0.1.0]: https://github.com/phac-nml/gasclustering/releases/tag/0.1.0
[0.2.0]: https://github.com/phac-nml/gasclustering/releases/tag/0.2.0
[0.3.0]: https://github.com/phac-nml/gasclustering/releases/tag/0.3.0
[0.4.0]: https://github.com/phac-nml/gasclustering/releases/tag/0.4.0
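
The `sample_name` rules listed in the removed 0.4.0 changelog entry above can be sketched in shell. This is only an illustration under stated assumptions: the function name `resolve_sample_name`, the `_` separator before the suffix, and the choice to return `sample` unchanged when no `sample_name` is given are all hypothetical, and the pipeline's own implementation may differ.

```bash
#!/usr/bin/env bash
# Hypothetical illustration of the sample_name rules from the changelog.
# Assumptions: "_" is used as the separator before the suffix, and a
# missing sample_name falls back to `sample` with no extra suffix.
resolve_sample_name() {
    local sample="$1"
    local sample_name="${2:-}"

    # No sample_name supplied: use the sample identifier as-is.
    if [[ -z "$sample_name" ]]; then
        printf '%s\n' "$sample"
        return
    fi

    # Replace every character that is not alphanumeric, "_" or "." with "_".
    local cleaned="${sample_name//[^[:alnum:]_.]/_}"

    # Suffix with the unique sample value to avoid repeated names.
    printf '%s_%s\n' "$cleaned" "$sample"
}

resolve_sample_name "S1" "my sample#1"   # prints my_sample_1_S1
resolve_sample_name "S2" ""              # prints S2
```

Note that `[:alnum:]` is locale-dependent, so non-ASCII letters may be kept or replaced depending on the environment.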
4 changes: 2 additions & 2 deletions CITATIONS.md
@@ -1,4 +1,4 @@
# phac-nml/gasclustering: Citations
# phac-nml/fastmatchirida: Citations

## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)

@@ -22,7 +22,7 @@

> Robertson, James, Wells, Matthew, Schonfeld, Justin, Reimer, Aleisha. Genomic Address Service: Convenient package for de novo clustering and sample assignment to existing clusters. 2023. https://github.com/phac-nml/genomic_address_service
- [ArborView (included in repository)](https://github.com/phac-nml/gasclustering/blob/dev/assets/ArborView.html) (in-development, citation subject to change)
- [ArborView (included in repository)](https://github.com/phac-nml/fastmatchirida/blob/dev/assets/ArborView.html) (in-development, citation subject to change)

## Software packaging/containerisation tools

8 changes: 4 additions & 4 deletions README.md
@@ -1,6 +1,6 @@
[![Nextflow](https://img.shields.io/badge/nextflow-%E2%89%A523.04.3-brightgreen.svg)](https://www.nextflow.io/)

# Genomic Address Service Clustering Workflow
# FastMatch IRIDA Workflow

This workflow takes provided JSON-formatted MLST profiles and converts them into a phylogenetic tree with associated flat cluster codes for use in [Irida Next](https://github.com/phac-nml/irida-next). The workflow also generates an interactive tree for visualization.

@@ -18,7 +18,7 @@ The structure of this file is defined in [assets/schema_input.json](assets/schem

## IRIDA-Next Optional Input Configuration

`gasclustering` accepts the [IRIDA-Next](https://github.com/phac-nml/irida-next) format for samplesheets which can contain an additional column: `sample_name`
`fastmatchirida` accepts the [IRIDA-Next](https://github.com/phac-nml/irida-next) format for samplesheets which can contain an additional column: `sample_name`

`sample_name`: An **optional** column, that overrides `sample` for outputs (filenames and sample names) and reference assembly identification.

@@ -87,7 +87,7 @@ Other parameters (defaults from nf-core) are defined in [nextflow_schema.json](n
To run the pipeline, please do:
```bash
nextflow run phac-nml/gasclustering -profile singularity -r main -latest --input https://github.com/phac-nml/gasclustering/raw/dev/assets/samplesheet.csv --outdir results
nextflow run phac-nml/fastmatchirida -profile singularity -r main -latest --input https://github.com/phac-nml/fastmatchirida/raw/dev/assets/samplesheet.csv --outdir results
```

Where the `samplesheet.csv` is structured as specified in the [Input](#input) section.
@@ -157,7 +157,7 @@ Details on the individual output files can be found in the [Output documentation
To run with the test profile, please do:

```bash
nextflow run phac-nml/gasclustering -profile docker,test -r main -latest --outdir results
nextflow run phac-nml/fastmatchirida -profile docker,test -r main -latest --outdir results
```

# Legal
2 changes: 1 addition & 1 deletion assets/adaptivecard.json
@@ -17,7 +17,7 @@
"size": "Large",
"weight": "Bolder",
"color": "<% if (success) { %>Good<% } else { %>Attention<%} %>",
"text": "phac-nml/gasclustering v${version} - ${runName}",
"text": "phac-nml/fastmatchirida v${version} - ${runName}",
"wrap": true
},
{
14 changes: 7 additions & 7 deletions assets/email_template.html
@@ -4,21 +4,21 @@
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">

<meta name="description" content="phac-nml/gasclustering: Example IRIDA Next pipeline">
<title>phac-nml/gasclustering Pipeline Report</title>
<meta name="description" content="phac-nml/fastmatchirida: Example IRIDA Next pipeline">
<title>phac-nml/fastmatchirida Pipeline Report</title>
</head>
<body>
<div style="font-family: Helvetica, Arial, sans-serif; padding: 30px; max-width: 800px; margin: 0 auto;">

<img src="cid:nfcorepipelinelogo">

<h1>phac-nml/gasclustering v${version}</h1>
<h1>phac-nml/fastmatchirida v${version}</h1>
<h2>Run Name: $runName</h2>

<% if (!success){
out << """
<div style="color: #a94442; background-color: #f2dede; border-color: #ebccd1; padding: 15px; margin-bottom: 20px; border: 1px solid transparent; border-radius: 4px;">
<h4 style="margin-top:0; color: inherit;">phac-nml/gasclustering execution completed unsuccessfully!</h4>
<h4 style="margin-top:0; color: inherit;">phac-nml/fastmatchirida execution completed unsuccessfully!</h4>
<p>The exit status of the task that caused the workflow execution to fail was: <code>$exitStatus</code>.</p>
<p>The full error message was:</p>
<pre style="white-space: pre-wrap; overflow: visible; margin-bottom: 0;">${errorReport}</pre>
@@ -27,7 +27,7 @@ <h4 style="margin-top:0; color: inherit;">phac-nml/gasclustering execution compl
} else {
out << """
<div style="color: #3c763d; background-color: #dff0d8; border-color: #d6e9c6; padding: 15px; margin-bottom: 20px; border: 1px solid transparent; border-radius: 4px;">
phac-nml/gasclustering execution completed successfully!
phac-nml/fastmatchirida execution completed successfully!
</div>
"""
}
@@ -44,8 +44,8 @@ <h3>Pipeline Configuration:</h3>
</tbody>
</table>

<p>phac-nml/gasclustering</p>
<p><a href="https://github.com/phac-nml/gasclustering">https://github.com/phac-nml/gasclustering</a></p>
<p>phac-nml/fastmatchirida</p>
<p><a href="https://github.com/phac-nml/fastmatchirida">https://github.com/phac-nml/fastmatchirida</a></p>

</div>

10 changes: 5 additions & 5 deletions assets/email_template.txt
@@ -4,15 +4,15 @@
|\\ | |__ __ / ` / \\ |__) |__ } {
| \\| | \\__, \\__/ | \\ |___ \\`-._,-`-,
`._,._,'
phac-nml/gasclustering v${version}
phac-nml/fastmatchirida v${version}
----------------------------------------------------
Run Name: $runName

<% if (success){
out << "## phac-nml/gasclustering execution completed successfully! ##"
out << "## phac-nml/fastmatchirida execution completed successfully! ##"
} else {
out << """####################################################
## phac-nml/gasclustering execution completed unsuccessfully! ##
## phac-nml/fastmatchirida execution completed unsuccessfully! ##
####################################################
The exit status of the task that caused the workflow execution to fail was: $exitStatus.
The full error message was:
@@ -35,5 +35,5 @@ Pipeline Configuration:
<% out << summary.collect{ k,v -> " - $k: $v" }.join("\n") %>

--
phac-nml/gasclustering
https://github.com/phac-nml/gasclustering
phac-nml/fastmatchirida
https://github.com/phac-nml/fastmatchirida
6 changes: 3 additions & 3 deletions assets/methods_description_template.yml
@@ -1,11 +1,11 @@
id: "nf-core-iridanext-methods-description"
description: "Suggested text and references to use when describing pipeline usage within the methods section of a publication."
section_name: "phac-nml/gasclustering Methods Description"
section_href: "https://github.com/phac-nml/gasclustering"
section_name: "phac-nml/fastmatchirida Methods Description"
section_href: "https://github.com/phac-nml/fastmatchirida"
plot_type: "html"
data: |
<h4>Methods</h4>
<p>Data was processed using phac-nml/gasclustering v${workflow.manifest.version} ${doi_text} of the nf-core collection of workflows (<a href="https://doi.org/10.1038/s41587-020-0439-x">Ewels <em>et al.</em>, 2020</a>), utilising reproducible software environments from the Bioconda (<a href="https://doi.org/10.1038/s41592-018-0046-7">Grüning <em>et al.</em>, 2018</a>) and Biocontainers (<a href="https://doi.org/10.1093/bioinformatics/btx192">da Veiga Leprevost <em>et al.</em>, 2017</a>) projects.</p>
<p>Data was processed using phac-nml/fastmatchirida v${workflow.manifest.version} ${doi_text} of the nf-core collection of workflows (<a href="https://doi.org/10.1038/s41587-020-0439-x">Ewels <em>et al.</em>, 2020</a>), utilising reproducible software environments from the Bioconda (<a href="https://doi.org/10.1038/s41592-018-0046-7">Grüning <em>et al.</em>, 2018</a>) and Biocontainers (<a href="https://doi.org/10.1093/bioinformatics/btx192">da Veiga Leprevost <em>et al.</em>, 2017</a>) projects.</p>
<p>The pipeline was executed with Nextflow v${workflow.nextflow.version} (<a href="https://doi.org/10.1038/nbt.3820">Di Tommaso <em>et al.</em>, 2017</a>) with the following command:</p>
<pre><code>${workflow.commandLine}</code></pre>
<p>${tool_citations}</p>
8 changes: 4 additions & 4 deletions assets/multiqc_config.yml
@@ -1,13 +1,13 @@
report_comment: >
This report has been generated by the <a href="https://github.com/phac-nml/gasclustering/releases/tag/dev" target="_blank">phac-nml/gasclustering</a>
This report has been generated by the <a href="https://github.com/phac-nml/fastmatchirida/releases/tag/dev" target="_blank">phac-nml/fastmatchirida</a>
analysis pipeline. For information about how to interpret these results, please see the
<a href="https://github.com/phac-nml/gasclustering/" target="_blank">documentation</a>.
<a href="https://github.com/phac-nml/fastmatchirida/" target="_blank">documentation</a>.
report_section_order:
"phac-nml-gasclustering-methods-description":
"phac-nml-fastmatchirida-methods-description":
order: -1000
software_versions:
order: -1001
"phac-nml-gasclustering-summary":
"phac-nml-fastmatchirida-summary":
order: -1002

export_plots: true