Fix a regression, update README and call it 3.0.0
charles-plessy committed Oct 8, 2024
1 parent c6621f0 commit 7f73852
Showing 3 changed files with 9 additions and 6 deletions.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -3,13 +3,14 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v2.1.0 - unreleased (tbd)
## v3.0.0 - October 8th, 2024 (Chaenocephalus aceratus)

- Run assemblyscan on the filtered genomes.
- Allocate only a single CPU for 1 hour with 6 Gb memory for all computations.
- Collect contig names to better check if sex chromosomes are missing from the assembly, etc.
- Replace `seqkit` with shell commands and `seqtk` because of memory usage
(https://github.com/shenwei356/seqkit/issues/487).
This changes directory and names of output files.

## v2.0.0 - September 24th, 2024 (Lama glama)

6 changes: 3 additions & 3 deletions README.md
@@ -5,10 +5,10 @@
**oist/LuscombeU_stlpreprocess** is a bioinformatics pipeline that ...

1. Extract chromosomal scaffolds from the assembly file (discard unplaced, alternate, organelle sequences, etc.).
2. Unmask the genome (to be re-masked later by another local pipeline.
3. Extract mitochondrial genomes from the assembly file (they might be useful later as an internal control).
2. Unmask the genome (to be re-masked later by another local pipeline).
3. Extract complete mitochondrial genomes from the assembly file (they might be useful later as an internal control).
4. Summarises the occurence of the first two letters of the accession numbers, to ease future changes of the grepping pattern for whole-chromosome scaffolds.
5. Record the name of the extracted contigs, for instance to check if sex chromosomes are missing from the assembly.
5. Record the name of the contigs, for instance to check if sex chromosomes are missing from the assembly.
6. Show in the MultiQC report some assembly statistics such as GC content and contig length extracted with the <https://github.com/rpetit3/assembly-scan> software.

## Usage
6 changes: 4 additions & 2 deletions modules/local/filter.nf
@@ -44,9 +44,11 @@ process FILTER {
grep -i mitochondri |
seqtk subseq $genome - | gzip --best --no-name > ${prefix}.mitogenome.fa.gz
# Remove outputs if empty (for some genomes the pattern does match chromosome-level scaffold accession numbers)
[ -z "\$(zcat ${prefix}.mitogenome.fa.gz | head)" ] && rm ${prefix}.mitogenome.fa.gz
# Remove mitogenome file if containing less or more than one sequence
[ \$(zcat ${prefix}.mitogenome.fa.gz | grep -c '>') -ne 1 ] && rm ${prefix}.mitogenome.fa.gz
# Remove outputs if empty (for some genomes the pattern does match chromosome-level scaffold accession numbers)
# And then, remove soft masks.
if [ -z "\$(zcat ${prefix}.chromosomes.fa.gz | head)" ]
then
rm ${prefix}.chromosomes.fa.gz

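The new check in `modules/local/filter.nf` keeps the mitogenome file only when it contains exactly one sequence. Outside of Nextflow (so without the `\$` escaping), the same idea might be sketched as below. The function names `count_fasta_records` and `keep_if_single_record` are hypothetical and not part of the pipeline, and anchoring the pattern with `^>` is a slightly stricter variation on the `grep -c '>'` used in the commit.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the number of FASTA records in a gzipped file.
# grep -c exits with status 1 when the count is zero, so '|| true' keeps
# 'set -e' and 'pipefail' from aborting the script in that case.
count_fasta_records() {
    gzip -dc "$1" | grep -c '^>' || true
}

# Hypothetical helper: delete the file unless it holds exactly one record.
keep_if_single_record() {
    local file="$1"
    local n
    n=$(count_fasta_records "$file")
    if [ "$n" -ne 1 ]; then
        rm -f "$file"
    fi
}
```

Calling `keep_if_single_record sample.mitogenome.fa.gz` (an illustrative file name) would leave the file in place if it has one header line and remove it if it is empty or holds several sequences. The explicit `if` also always returns a zero exit status, whereas a bare `[ ... ] && rm ...` returns non-zero whenever the test is false.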