Releases: zellerlab/GECCO
0.8.4
0.8.3-post1
Fixed
- Wrong default value for --threshold being shown in the gecco run help message.
0.8.3
0.8.2
Fixed
- gecco run crashing on Python 3.6 because of the missing contextlib.nullcontext class.
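For context, contextlib.nullcontext was added to the standard library in Python 3.7, so code using it fails on 3.6. The following is a minimal sketch of one common way to handle this, not necessarily the fix applied in GECCO; the fallback definition and the read_all helper are hypothetical names used only for illustration.

```python
import contextlib
import io

try:
    # Available in the standard library from Python 3.7 onwards.
    nullcontext = contextlib.nullcontext
except AttributeError:
    # Assumed fallback for Python 3.6: a context manager that does nothing
    # and simply hands back the object it was given.
    @contextlib.contextmanager
    def nullcontext(enter_result=None):
        yield enter_result


def read_all(handle):
    """Read from a caller-owned handle without closing it afterwards."""
    with nullcontext(handle) as f:
        return f.read()


buffer = io.StringIO("ATGC")
content = read_all(buffer)
# The handle is still open, because nullcontext does not close it.
still_open = not buffer.closed
```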
Changed
- gecco run and gecco annotate will not try to count the number of profiles when given an external HMM file with the --hmm flag.
- PyHMMER.run now reports the p-value of each domain in addition to the e-value as a /note qualifier.
0.8.1
Changed
- gecco run now filters out unneeded features before annotating, making it easier to analyze the results of a run with a custom --model.
Fixed
- gecco reporting about using Pfam v33.1 while actually using v34.0 because of an outdated field in gecco/hmmer/Pfam.ini.
Added
- Missing documentation for the strand attribute of gecco.model.Gene.
0.8.0
Changed
- Retrain internal model using new sequence embeddings and remove broken/duplicate BGCs from MIBiG 2.0.
- Bump minimum pyhmmer version to v0.4.0 to improve exception handling.
- Bump minimum pyrodigal version to v0.5.0 to fix sequence decoding on some platforms.
- Use p-values instead of e-values to filter domains obtained with HMMER.
- gecco cv and gecco train now seed the RNG with a user-defined seed before shuffling rows of training data.
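Seeding the RNG before shuffling makes the row order reproducible between runs. The snippet below is a minimal sketch of that idea using only the standard library; the shuffled_rows helper is a hypothetical name, and GECCO itself may use a different RNG or shuffling routine.

```python
import random


def shuffled_rows(rows, seed):
    """Return a shuffled copy of `rows`, reproducible for a given seed."""
    # A dedicated Random instance avoids touching the global RNG state.
    rng = random.Random(seed)
    rows = list(rows)
    rng.shuffle(rows)
    return rows


first = shuffled_rows(range(10), seed=42)
second = shuffled_rows(range(10), seed=42)
# Both calls use the same seed, so they produce the same order.
same_order = first == second
```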
Fixed
- Extraction of BGC compositions for the type predictor while training.
- ClusterCRF.trained failing to open an external model.
Added
- Domain.pvalue attribute to access the p-value of a domain annotation.
- Mandatory pvalue column to FeatureTable objects.
- Support for loading several feature tables in gecco train and gecco cv.
- Warnings to ClusterCRF.fit when selecting uninformative features.
- --correction flag to gecco train and gecco cv, allowing the user to specify a multiple testing correction method when computing p-values with Fisher's exact test.
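To illustrate what a multiple testing correction does to a set of p-values, here is a minimal sketch of the Benjamini-Hochberg procedure in plain Python. The release notes do not say which methods --correction accepts, so Benjamini-Hochberg is an assumption chosen for illustration, and the input p-values are made up rather than computed by a Fisher's exact test.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, one per tested feature.
raw = [0.01, 0.04, 0.03, 0.005]
corrected = benjamini_hochberg(raw)
```

Adjusted values are never smaller than the raw ones, so fewer features pass a fixed significance threshold after correction.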
Removed
- Outdated gecco embed command.
- Unused --truncate flag from the gecco train CLI.
- Tigrfam domains, which were not improving performance on the new training data.
0.7.0
Added
- Support for writing an AntiSMASH sideload JSON file after a gecco run workflow.
- Code for converting GenBank files to a BiG-SLiCE compatible format with the gecco convert subcommand.
- Documentation about using GECCO in combination with AntiSMASH or BiG-SLiCE.
Changed
- Minimum Biopython version to v1.73 for compatibility with older bioinformatics tooling.
- Internal domain composition shipped in gecco.types with a newer composition array obtained directly from MIBiG files.
Removed
- Outdated notice about -vvv verbosity level in the help message of the main gecco command.
0.6.3
Fixed
- HMMER annotation not properly handling inputs with multiple contigs.
- Some progress bar totals displaying as floats in the CLI.
Changed
- PyHMMER now sets the Z and domZ values from the number of proteins given to the search pipeline.
- gecco.cli delegates imports to make the CLI more responsive.
- pkg_resources has been replaced with importlib.resources and importlib.metadata where applicable.
- multiprocessing.cpu_count has been replaced with os.cpu_count where applicable.
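The last two changes swap third-party or heavier helpers for standard library equivalents. As a rough sketch of what such replacements can look like (the package_version and worker_count helpers are hypothetical, and importlib.metadata is assumed to be available, which requires Python 3.8 or later):

```python
import importlib.metadata
import os


def package_version(name, default="unknown"):
    """Look up an installed distribution's version without pkg_resources."""
    try:
        return importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        return default


def worker_count(requested=0):
    """Use the requested job count, or fall back to all logical CPUs."""
    if requested > 0:
        return requested
    # os.cpu_count() may return None when the count cannot be determined.
    return os.cpu_count() or 1


missing = package_version("surely-not-an-installed-distribution")
jobs = worker_count()
```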
0.6.2
0.6.1
Fixed
- Progress bar not being disabled by -q flag in CLI.
- Fallback to using HMM name if accession is not available in PyHMMER.
- Group genes by source contig and process them separately in PyHMMER to avoid bogus E-values.
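HMMER derives an E-value by multiplying a p-value by the size of the search space (Z), so batching genes from unrelated contigs together changes the reported E-values. A minimal sketch of grouping genes by their source contig before annotation could look like this; the (contig, gene) tuple layout and the group_by_contig helper are assumptions for illustration, not GECCO's data model.

```python
from collections import defaultdict


def group_by_contig(genes):
    """Group (contig_id, gene_id) pairs into one list of genes per contig."""
    groups = defaultdict(list)
    for contig_id, gene_id in genes:
        groups[contig_id].append(gene_id)
    return dict(groups)


# Hypothetical genes from two contigs, in arbitrary order.
genes = [("contig_1", "gene_1"), ("contig_2", "gene_2"), ("contig_1", "gene_3")]
by_contig = group_by_contig(genes)
# Each group can then be handed to the annotation step on its own.
```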
Added
- psutil dependency to get the number of physical CPU cores on the host machine.
- Support for using an arbitrary mapping of positives to negatives in gecco embed.
Removed
- Unused and outdated HMMER and DomainRow classes from gecco.hmmer.