Releases: MatthewRalston/kmerdb

v0.8.9

11 Oct 01:03

New features

  • average k-mer coverage, k-mer histogram
  • Strassen multiplication, NumPy delegations
  • batched Strassen
  • regression and least-squares analysis

Bug fixes and quality-of-life improvements.
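
For readers unfamiliar with the Strassen multiplication named in the list above, here is a minimal pure-Python sketch of the textbook algorithm. It assumes square matrices whose size is a power of two, and the function names (`strassen`, `split`, `add`, `sub`) are illustrative only. It is not kmerdb's implementation, which per the notes above delegates to NumPy and supports batching.

```python
def add(a, b):
    """Element-wise sum of two equally sized matrices (lists of lists)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    """Element-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split(m):
    """Split a 2n x 2n matrix into its four n x n quadrants."""
    h = len(m) // 2
    return (
        [row[:h] for row in m[:h]],  # top-left
        [row[h:] for row in m[:h]],  # top-right
        [row[:h] for row in m[h:]],  # bottom-left
        [row[h:] for row in m[h:]],  # bottom-right
    )


def strassen(a, b):
    """Multiply two square matrices whose size is a power of two."""
    if len(a) == 1:
        return [[a[0][0] * b[0][0]]]
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    # Seven recursive products instead of the eight used by the naive method.
    m1 = strassen(add(a11, a22), add(b11, b22))
    m2 = strassen(add(a21, a22), b11)
    m3 = strassen(a11, sub(b12, b22))
    m4 = strassen(a22, sub(b21, b11))
    m5 = strassen(add(a11, a12), b22)
    m6 = strassen(sub(a21, a11), add(b11, b12))
    m7 = strassen(sub(a12, a22), add(b21, b22))
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    # Stitch the quadrants back together row by row.
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom


print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

In practice the recursion is usually cut off at some block size and handed to an optimized routine, which is presumably what "NumPy delegations" refers to.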

v0.8.5

17 Jul 20:55

Hotfix to the 0.8.4 release, which was said to be 'acceptance tested'. The profile UI rework landed during the 0.8.4-0.8.5 interlude, the commit chosen for 0.8.4 needed to be yanked, and the UI fix is replaced here with this hotfix. This release writes correct counts (not zeroes; see the diff) to the file. I'm glad I noticed just a few days after the 0.8.4 patch.

This release includes the recent features:

  • graph subcommand
  • usage|help subcommands
  • --debug for default error handling
  • -o|--output-name revised usage patterns
  • <samplesheet.txt|input_1.fa|input2_.fq.gz> samplesheet pattern.
  • --quiet throughout verbose commands

v0.8.2

14 May 00:27

Acceptance-tested version v0.8. New features (vs. 0.7.6+) include:

  • exit_summary
  • graph subcommand
  • usage/help subcommand
  • Deprecations
  • Improved logging
  • Logfile
  • --num-log-lines to display the last X lines of the log in the exit_summary error data structure when an exception is raised
  • Command banner
  • Fixed citation subcommand.

as well as a number of bug fixes.

Notes

Acceptance-tested 'stable' v0.8 release. Some regressions may remain silent in usage without the --debug flag. --debug skips the exit-summary feature, which condenses metadata and helps the developer collect and assess bugs by making the relevant feature/step metadata, the traceback, the feature description, and other error information available at program exit. The structure is YAML/JSON.

The data structure schema is located at config.exit_summary_schema.

v0.8.0

12 Apr 23:33
1c87ac4

--debug flag introduced to skip the error/exit handling module. In brief, errors are caught and processed differently under --debug mode. Convenience features have been added to the standard invocation to provide an "exit summary", which clearly describes the 'last loggable line', abbreviates the logging to stderr, and captures traceback objects, errors, the program "steps" and "features" (described in config.py) where the program failed, and other relevant metadata.

Use --debug if the program exits early or with no logged information.
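
The difference between the standard invocation and --debug can be pictured with a small sketch. Everything here is hypothetical: the wrapper name `run_with_exit_summary` and the summary keys are illustrative assumptions, not kmerdb's API or its actual exit summary schema.

```python
import traceback


def run_with_exit_summary(step_fn, debug=False, feature="example-feature", step="example-step"):
    """Run one program step; return (result, summary).

    Hypothetical helper: the name and the summary keys are assumptions
    made for illustration, not kmerdb's real schema.
    """
    if debug:
        # Debug mode: skip the handler so the raw exception propagates.
        return step_fn(), None
    try:
        return step_fn(), None
    except Exception as exc:
        # Standard mode: condense the failure into one summary structure.
        summary = {
            "feature": feature,
            "step": step,
            "error": type(exc).__name__,
            "message": str(exc),
            "traceback": traceback.format_exc().splitlines(),
        }
        return None, summary


result, summary = run_with_exit_summary(lambda: 1 / 0)
print(summary["error"])  # ZeroDivisionError
```

With `debug=True` the same call raises `ZeroDivisionError` directly, which mirrors the advice to reach for --debug when the summarized output is not informative enough.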

kmerdb usage -m method introduced

Revisions tested across the board.

profile, graph, usage, matrix, distance, kmeans, and hierarchical have been acceptance tested.

0.7.9

06 Apr 05:27
c7db249
Pre-release

This release includes the new graph command. Some housekeeping. Some bona fide regressions handled.

Maintenance release. New interface on the way.

v0.7.8

28 Mar 19:39
84a33ea
Pre-release

kmerdb graph introduced, producing a new file format, .kdbg, an edge list. New metadata schema for the new format as well. kmerdb view and kmerdb header are compatible with the new format.

The goal is to create a weighted graph. Support for assembly and graph visualizations is planned for the future.

After 0.7.6 the .kdb spec will be loosely deprecated. While the .kdb format may remain unchanged (don't know yet), the goal is to produce an adjacency list structure from only the k-mer counts and the 'neighbor' k-mer ids. After the format revision (mostly to the --all-metadata option), a new command, kmerdb graph, will be used to generate an on-disk representation of an adjacency list.
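
To make the idea of 'neighbor' k-mer ids concrete, here is a small sketch. It assumes a 2-bit base encoding (A=0, C=1, G=2, T=3) with the k-mer id read as a base-4 number. That encoding and the function names are assumptions for illustration and may not match kmerdb's own id scheme.

```python
ALPHABET = "ACGT"  # assumed ordering: A=0, C=1, G=2, T=3


def kmer_to_id(kmer):
    """Encode a k-mer as a base-4 integer under the assumed ordering."""
    idx = 0
    for base in kmer:
        idx = idx * 4 + ALPHABET.index(base)
    return idx


def id_to_kmer(idx, k):
    """Decode a base-4 integer back into a k-mer of length k."""
    bases = []
    for _ in range(k):
        bases.append(ALPHABET[idx % 4])
        idx //= 4
    return "".join(reversed(bases))


def neighbor_ids(idx, k):
    """Return (left, right) neighbor ids sharing a (k-1)-base overlap."""
    suffix = idx % (4 ** (k - 1))  # drop the first base
    prefix = idx // 4              # drop the last base
    right = [suffix * 4 + b for b in range(4)]
    left = [b * 4 ** (k - 1) + prefix for b in range(4)]
    return left, right


left, right = neighbor_ids(kmer_to_id("ACGT"), 4)
print([id_to_kmer(i, 4) for i in left])   # ['AACG', 'CACG', 'GACG', 'TACG']
print([id_to_kmer(i, 4) for i in right])  # ['CGTA', 'CGTC', 'CGTG', 'CGTT']
```

Each k-mer has at most four left and four right neighbors, so an adjacency list row stays small regardless of k.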

  • What does this mean?

At this point, the new feature is in the planning stage, and it is not known if backwards compatibility (< 0.7.7) will be supported. One goal is to create an adjacency list structure on disk from the --all-metadata augmented .kdb format. It is not clear yet if cycles will be permitted in the graph structure, or if a distinct "offset" flag will be used. An example follows.

  • 0.7.6 .kdb format
    col1 is the row number, col2 is the sort order, col3 is the k-mer id, and col4 is the k-mer count. col5 (--all-metadata) featured a loosely specified 'neighbor' JSON field, consisting of a dictionary with "A", "C", "T", "G" etc. keys, and it was poorly implemented. Basically, the neighboring (left-side and right-side) k-mer ids were provided.
1    1    1    123
  • 0.7.7+ .kdbg
    col1 is a unique row number, col2 is the k-mer id (may be repeated), and col3 is a comma-separated field of possible adjacent row ids, corresponding to the neighbors in k-mer space of the k-mer id in col2. col4 represents a possible solution for a graph traversal that produces a Hamiltonian (or similar) walk through the graph, recapitulating either the exact (.fasta) assembly solution or a potential solution to the assembly from the available data. A feasible solution might use networkx or a custom graph traversal algorithm that minimizes the penalty of omitting rows/k-mers, following the shortest path that visits each k-mer once while perhaps also maximizing the number of rows visited. I'm not sure yet how this will be specifically implemented, as the .kdbg format is the first step.
1    1234    2345,3456,...    3
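
As a rough illustration of how rows in the proposed .kdbg layout could be read into an adjacency list, consider the sketch below. It assumes whitespace- or tab-delimited columns in the order described above, and the function name `parse_kdbg_rows` and the sample rows are made up for the example. The real format also carries a metadata header, which is not handled here.

```python
def parse_kdbg_rows(lines):
    """Parse rows of the form: row_number, kmer_id, neighbors_csv, order.

    Hypothetical reader for the column layout described in these notes;
    it is not the parser shipped with kmerdb.
    """
    adjacency = {}  # row number -> list of adjacent ids
    kmer_ids = {}   # row number -> k-mer id
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row, kmer_id, neighbors, _order = line.split()
        row = int(row)
        kmer_ids[row] = int(kmer_id)
        adjacency[row] = [int(n) for n in neighbors.split(",") if n]
    return adjacency, kmer_ids


# Made-up rows, not real kmerdb output.
rows = ["1\t1234\t2,3\t3", "2\t2345\t3\t1", "3\t3456\t1\t2"]
adjacency, kmer_ids = parse_kdbg_rows(rows)
print(adjacency)  # {1: [2, 3], 2: [3], 3: [1]}
```

A dictionary like this can be passed to graph libraries such as networkx for traversal experiments.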

v0.7.4

03 Jan 12:13

Build system migration completed. Closes #109. Completes triage of the hotfix rebase. Old master abrogated.

v0.7.1alpha

28 Dec 22:18
Pre-release

Migrating to a new wheel generation process using python -m build and pyproject.toml. This generates a working installed module, kmerdb.

This can be confirmed by running

$ python
>>> import kmerdb

Similarly, direct module invocation works fine, although this minor release is otherwise bugged because of the pyproject.toml.

python -m kmerdb -h

v0.7.0

14 Jul 00:43
1a885da

Adds parallelization support to the Cythonized distance function 'pearson'

kmerdb distance -p 10 pearson inputs.tsv

v0.6.9

29 Jun 02:53

Minor bug fixes and a patch to the Cython extension to use an 80-bit floating-point accumulator.
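
Why the accumulator precision matters can be shown with a short sketch. Python's standard library has no 80-bit float, so this example uses `math.fsum` as a stand-in for a higher-precision accumulator. It illustrates the rounding problem only and is not the approach used in the Cython extension.

```python
import math


def naive_sum(values):
    """Accumulate in a plain 64-bit float, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1] * 10
print(naive_sum(values))  # 0.9999999999999999 (rounding error accumulates)
print(math.fsum(values))  # 1.0 (intermediate sums tracked without loss)
```

Sums over many k-mer counts, as in a correlation coefficient, involve far more additions than this, so the accumulated error can be correspondingly larger.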