Releases: Plant-Food-Research-Open/genepal
Version 0.6.0
What's Changed
Added
- Added cDNA and CDS outputs to <OUTPUT_DIR>/annotations/ directory #118
- Added parameter `add_attrs_to_proteins_cds_fastas`
- Added parameter `filter_genes_by_aa_length` with default set to `24`, which allows removal of genes with ORFs shorter than 24 #125
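The length filter behind `filter_genes_by_aa_length` can be sketched as below. This is a minimal illustration, not the pipeline's implementation: the function name `filter_by_aa_length`, the dictionary input, and the assumption that the threshold counts amino acids (suggested only by the parameter name) are all hypothetical.

```python
def filter_by_aa_length(proteins, min_aa_length=24):
    """Keep entries whose protein has at least `min_aa_length` residues.

    `proteins` maps a gene ID to its translated ORF (hypothetical input shape).
    A trailing stop symbol '*' is not counted as a residue (an assumption).
    """
    kept = {}
    for gene_id, seq in proteins.items():
        residues = seq.rstrip("*")
        if len(residues) >= min_aa_length:
            kept[gene_id] = seq
    return kept


proteins = {
    "gene1": "M" + "A" * 30 + "*",  # 31 residues, kept
    "gene2": "MKT*",                # 3 residues, removed
}
print(sorted(filter_by_aa_length(proteins)))
```

Genes at exactly the threshold are kept in this sketch, since the release note describes removing only ORFs shorter than 24.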
Fixed
- Fixed an issue where TSEBRA failed because LIFTOFF lifted non-protein coding genes #121
- Switched branch name from `master` to `main` in the GHA CIs
- Fixed an issue in `genepal_report.Rmd` which caused the pangene matrix plot to fail when the number of clusters exceeded 65536 #124
- Fixed an issue where `GENEPALREPORT` process failed due to OOM kill signal from SLURM #123
- Fixed an issue where Gff merge after liftoff failed when one of the Gff files did not contain any genes
- Fixed an issue where `gxf_fasta_agat_spaddintrons_spextractsequences` crashed due to short introns #89
Dependencies
- Nextflow!>=24.04.2
- [email protected]
Deprecated
- Removed parameter `add_attrs_to_proteins_fasta`
PRs
- Add gffread EXTRACT_CDNA and EXTRACT_CDS feature to outputs by @liamlelievre in #119
- Fixed TSEBRA failure issue by @GallVp in #122
- Fixed issues in genepal-report by @GallVp in #126
- Added parameter filter_genes_by_aa_length by @GallVp in #127
- Fixed post-liftoff merge by @GallVp in #130
- Fixed a crash due to short introns by @GallVp in #131
- Release candidate for 0.6.0 by @GallVp in #129
New Contributors
- @liamlelievre made their first contribution in #119
Full Changelog: 0.5.0...0.6.0
Version 0.5.0
What's Changed
Added
- Added MultiQC #65
- Updated nf-core template to 3.0.2 #66
- Integrated nf-test into pipeline CI #68
- Updated the flowchart #87
- Added a large test dataset for the `test_full` profile #90
- Now `.gff.gz` and `.gff3.gz` inputs are also allowed for the `benchmark` column in `--input`
- Now removing liftoff genes with any intron shorter than 10bp #89
- Now also removing `rRNA` and `tRNA` after liftoff as the downstream logic in the pipeline cannot correctly handle these
- Now skipping FastQC by default #98
- Added an HTML report #44
- Added content type as text/html for the MultiQC and genepal reports
- Added sra-tools for RNASeq data download #102
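The short-intron rule listed above (#89) can be illustrated with a small sketch. It assumes 1-based, inclusive exon coordinates as in GFF3 and treats each gene as a single list of exons; the function names and the input shape are hypothetical, and the pipeline's own implementation may differ.

```python
def intron_lengths(exons):
    """Return intron lengths for one transcript.

    `exons` is a list of (start, end) pairs, 1-based and inclusive.
    """
    ordered = sorted(exons)
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


def has_short_intron(exons, min_intron_bp=10):
    """True if any intron is shorter than `min_intron_bp`."""
    return any(length < min_intron_bp for length in intron_lengths(exons))


genes = {
    "geneA": [(100, 200), (206, 300), (400, 500)],  # introns of 5 and 99 bp
    "geneB": [(100, 200), (211, 300)],              # one intron of 10 bp
}
kept = {gid: ex for gid, ex in genes.items() if not has_short_intron(ex)}
print(sorted(kept))
```

An intron of exactly 10 bp is retained here because the note describes removing only introns shorter than 10 bp.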
Fixed
- Now using `${meta.id}_trim` as prefix for `FASTQC` files
- Updated citations to include DOIs
- Fixed a bug where FASTQ versions were not correctly captured
- Now using the correct out channel from `STAR_ALIGN`. This bug was introduced by a module update during the development of this version #74
- Fixed OrthoFinder results copy failure on AWS #108
Dependencies
- Nextflow!>=24.04.2
- [email protected]
Deprecated
- Resource parameters have been removed: `max_memory`, `max_cpus`, `max_time`
- Removed a number of unnecessary parameters: `monochromeLogs`, `config_profile_contact`, `config_profile_url`, `validationFailUnrecognisedParams`, `validationLenientMode`, `validationSchemaIgnoreParams`, `validationShowHiddenParams`, `validate_params`
- Removed `extra_fastp_args` and replaced it with `fastp_extra_args`
- Removed and replaced `skip_fastp` and `skip_fastqc` with `fastp_skip` and `fastqc_skip` #82
PRs
- Updated nf-core template to 3.0.2 by @GallVp in #67
- Integrated nf-test into pipeline CI by @GallVp in #71
- Updated docs to include -r flag by @GallVp in #72
- Now using the correct out channel from STAR_ALIGN by @GallVp in #78
- Removed extra_fastp_args and replaced it with fastp_extra_args by @GallVp in #81
- Removed and replaced skip_fastp and skip_fastqc by @GallVp in #83
- Updated the flowchart by @GallVp in #88
- Added a large dataset for test_full by @GallVp in #91
- Now skipping FastQC by default by @GallVp in #99
- Added an HTML report by @GallVp in #100
- Added content type as text/html for the MultiQC and genepal reports by @GallVp in #101
- Added sra-tools for RNASeq data download by @GallVp in #103
- Fixed minor issues in report modules by @GallVp in #106
- Fixed OrthoFinder results copy failure on AWS by @GallVp in #109
- Added doi and bumped version by @GallVp in #110
- Removed an unnecessary config block by @GallVp in #111
- Fixed linting issues by @GallVp in #113
- Added GeneMark license info by @GallVp in #114
- Candidate for 0.5.0 by @GallVp in #112
Full Changelog: 0.4.0...0.5.0
Version 0.4.0
What's Changed
Added
- Added `orthofinder_annotations` param
- Added `FASTA_GFF_ORTHOFINDER` sub-workflow
- Added evaluation by BUSCO #41
- Included common tax ids for eggnog mapper #27
- Implemented hierarchical naming scheme: geneI.tJ, geneI.tJ.exonK, geneI.tJ.cdsK #19, #34
- Now sorting list of bam and list of fastq before cat to avoid resume cache misses
- Allowed BAM files for RNA evidence #3
- Added `GXF_FASTA_AGAT_SPADDINTRONS_SPEXTRACTSEQUENCES` sub-workflow for splice type statistics #11
- Changed `orthofinder_annotations` from FASTA/GFF to protein FASTA #43
- Added param `enforce_full_intron_support` to turn on/off strict model purging by TSEBRA #21
- Added param `filter_liftoff_by_hints` to evaluate liftoff models with TSEBRA to make sure they have the same level of evidence as BRAKER #28
- Added a script to automatically check module version updates
- Reduced `BRAKER3` threads to 8 #55
- Now the final annotations are stored in the `annotations` folder #53
- Now a single `fasta` file can be directly specified for `protein_evidence`
- `eggnogmapper_db_dir` is not a required parameter anymore
- `eggnogmapper_tax_scope` is now set to 1 (root div) by default
- Added a `test` profile based on public data
- Added parameter `add_attrs_to_proteins_fasta` to enable/disable addition of decoded gff attributes to proteins fasta #58
- Added a check for input assemblies. If an assembly is smaller than 1 MB (or 300KB in zipped format), the pipeline errors out before starting the downstream processes #47
- Now `REPEATMASKER` GFF output is saved via `CUSTOM_RMOUTTOGFF3` #54
- Added `benchmark` column to the input sheet and used `GFFCOMPARE` to perform benchmarking #63
- Added `SEQKIT_RMDUP` to detect duplicate sequence and wrap the fasta to 80 characters
- Updated parameter section labels for annotation and post-annotation filtering #64
- Updated modules and sub-workflows
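The hierarchical naming scheme listed above (geneI.tJ, geneI.tJ.exonK, geneI.tJ.cdsK) can be illustrated with a short sketch. The function name `hierarchical_ids`, the literal `gene` prefix, and 1-based numbering for I, J and K are assumptions made for illustration; the release note only gives the pattern.

```python
def hierarchical_ids(gene_index, transcript_index, n_exons, n_cds):
    """Build IDs following the geneI.tJ / geneI.tJ.exonK / geneI.tJ.cdsK pattern.

    Indices are assumed to start at 1 (not confirmed by the release note).
    """
    gene_id = f"gene{gene_index}"
    transcript_id = f"{gene_id}.t{transcript_index}"
    exon_ids = [f"{transcript_id}.exon{k}" for k in range(1, n_exons + 1)]
    cds_ids = [f"{transcript_id}.cds{k}" for k in range(1, n_cds + 1)]
    return gene_id, transcript_id, exon_ids, cds_ids


gene_id, transcript_id, exon_ids, cds_ids = hierarchical_ids(3, 2, n_exons=2, n_cds=1)
print(gene_id, transcript_id, exon_ids, cds_ids)
```

With IDs built this way, the parent of any feature can be recovered by dropping the last dot-separated component.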
Fixed
- Fixed BRAKER spellings #36
- Fixed liftoff failure when lifting off from a single reference #40
- Added versions from GFF_STORE sub-workflows #33
Dependencies
- NextFlow!>=23.04.4
- nf-validation=1.1.3
Deprecated
- Renamed `external_protein_fastas` param to `protein_evidence`
- Renamed `fastq` param to `rna_evidence`
- Renamed `braker_allow_isoforms` param to `allow_isoforms`
- Moved liftoffID from gene level to mRNA/transcript level
- Moved `version_check.sh` to `.github/version_checks.sh`
- Removed dependency on https://github.com/kherronism/nf-modules.git for `BRAKER3` and `REPEATMASKER` modules which are now installed from https://github.com/GallVp/nxf-components.git
- Removed dependency on https://github.com/PlantandFoodResearch/nxf-modules.git
- Now the final annotations are not stored in the `final` folder
- Now BRAKER3 outputs are not saved by default #53 and saved under `etc` folder when enabled
- Removed `local` profile. Local executor is the default when no executor is specified. Therefore, the `local` profile was not needed.
- Removed `CUSTOM_DUMPSOFTWAREVERSIONS`
Full Changelog: 0.3.3...0.4.0
Version 0.3.3
What's Changed
Full Changelog: 0.3.2...0.3.3
Added
- Added a stub test to evaluate the case where an assembly is soft masked but has no annotations
Fixed
- Fixed a bug where `is_masked` was ignored by the pipeline
- Fixed a bug in param validation which allowed specification of `braker_hints` without `braker_gff3`
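The validation rule fixed above can be sketched as a simple dependency check. This is a hedged illustration only: the function name `validate_braker_columns` and the dictionary-per-row input are hypothetical, and the sketch checks just the one condition stated in the note (`braker_hints` given without `braker_gff3`).

```python
def validate_braker_columns(row):
    """Return an error message for an invalid row, or None if the row is valid.

    `row` is a dict of column name to value (hypothetical input shape).
    Empty strings and missing keys are both treated as "not provided".
    """
    hints = row.get("braker_hints")
    gff3 = row.get("braker_gff3")
    if hints and not gff3:
        return "braker_hints was provided without braker_gff3"
    return None


print(validate_braker_columns({"braker_hints": "hints.gff"}))
print(validate_braker_columns({"braker_gff3": "braker.gff3", "braker_hints": "hints.gff"}))
```

Whether `braker_gff3` may be given on its own is not stated in the release note, so this sketch does not reject that case.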
Dependencies
- NextFlow!>=23.04.4
- nf-validation=1.1.3
Deprecated
Version 0.3.2
What's Changed
Full Changelog: 0.3.1...0.3.2
Version 0.3.1
What's Changed
Full Changelog: 0.3.0...0.3.1
Version 0.3.0
What's Changed
Commit history: v0.2...0.3.0
Added
- Added changelog and semantic versioning
- Changed license to MIT
- Updated `.editorconfig`
- Moved .literature to test/ branch
- Renamed `pangene_local` to `local_pangene`
- Renamed `pangene_pfr` to `pfr_pangene`
- Added versioning checking
- Updated github workflow to use pre-commit instead of prettier and editorconfig check
- Added central singularity cache dir for pfr config
- Added `SORTMERNA_INDEX` before `SORTMERNA`
- Fixed sample contamination bug introduced by `file.simpleName`
- Now using empty files for stub testing in CI
- Now BRAKER can be skipped by including BRAKER outputs from previous runs in the `target_assemblies` param
- Added `gffcompare` to merge liftoff annotations
- Renamed `samplesheet` param to `fastq`
- Now using assemblysheet in combination with nf-validation for assembly input
- Added nextflow_schema.json
- Now using nf-validation to validate fastqsheet provided by params.fastq
- Moved `manifest.config` and `reporting_defaults.config` content to `nextflow.config`
- Now using a txt file for `params.external_protein_fastas`
- Now using nf-validation for `params.liftoff_annotations`
- Now using nf-validation for all the parameters
- Added `PURGE_BREAKER_MODELS` sub-workflow
- Added `GFF_EGGNOGMAPPER` sub-workflow
- Now using a custom version of `GFFREAD` which supports `meta` and `fasta`
- Now using TSEBRA to purge models which do not have full intron support from BRAKER hints
- Added params `eggnogmapper_evalue` and `eggnogmapper_pident`
- Added `PURGE_NOHIT_BRAKER_MODELS` sub-workflow
- Now merging BRAKER and liftoff models before running eggnogmapper
- Added `GFF_MERGE_CLEANUP` sub-workflow
- Now using `description` field to store notes and textual annotations in the gff files
- Now using `mRNA` in place of `transcript` in gff files
- Now `eggnogmapper_purge_nohits` is set to `false` by default
- Added `GFF_STORE` sub workflow
- `external_protein_fastas` and `eggnogmapper_db_dir` are not mandatory parameters
- Added contributors
- Add a document for the pipeline parameters
- Updated `pfr_pangene` and `pfr/profile.config`
- Now using local tests/stub files for GitHub CI
- Now removing iso-forms left by TSEBRA using `AGAT_SPFILTERFEATUREFROMKILLLIST`
- Added `pyproject.toml`
- Now using PFAMs from eggnog if description is '-'
Fixed
- Removed liftoff models with `valid_ORF=False`
- Updated license text to include 'Copyright (c) 2024 The New Zealand Institute for Plant and Food Research Limited'
Dependencies
- NextFlow!>=23.04.4
- nf-validation=1.1.3