Releases: will-rowe/hulk
v1.0.0
This is a complete re-implementation of HULK. The change log states the main changes:
- fully re-written codebase
  - I've aimed for it to be largely backwards compatible with previous releases
- fully open-sourced!
  - MIT license (OSI approved)
- algorithm changes
  - underlying histogram is now based on minimizer frequencies
  - count-min sketch for k-mer frequencies is now replaced with a fixed-size array and a jump-hash for minimizer placement
- changes to the `sketch` subcommand:
  - sketches saved to JSON by default (à la sourmash)
  - histosketch count-min sketch is no longer configurable by the user (this was previously set via Epsilon and Delta)
  - spectrum size is determined based on k-mer size
  - minCount for k-mer frequencies is removed
- changes to the `smash` subcommand:
  - operates on JSON input
  - outputs matrix as CSV
- replaced some unnecessary features
  - the functionality of the `print` and `distance` subcommands is available in the `smash` subcommand
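
To illustrate the algorithm changes listed for this release, here is a minimal sketch of counting minimizer frequencies in a fixed-size array, with jump hash choosing the bin. It is not HULK's actual code: the function names (`jumpHash`, `minimizers`, `histogram`), the window and bin sizes, and the use of FNV-1a as the k-mer hash are all assumptions made for the example. Only `jumpHash` follows a published algorithm (jump consistent hash by Lamping and Veach).

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// jumpHash is the jump consistent hash of Lamping and Veach. It maps a
// 64-bit key to a bucket in [0, numBuckets); numBuckets must be positive.
func jumpHash(key uint64, numBuckets int32) int32 {
	var b int64 = -1
	var j int64
	for j < int64(numBuckets) {
		b = j
		key = key*2862933555777941757 + 1
		j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
	}
	return int32(b)
}

// hashKmer hashes a k-mer with FNV-1a. This is a stand-in chosen for the
// example, not the hash function HULK uses.
func hashKmer(kmer string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(kmer))
	return h.Sum64()
}

// minimizers returns, for each window of w consecutive k-mers, the smallest
// k-mer hash in that window. A window is skipped when its minimizer sits at
// the same position as the previous window's minimizer.
func minimizers(seq string, k, w int) []uint64 {
	n := len(seq) - k + 1 // number of k-mers in seq
	if k <= 0 || w <= 0 || n < w {
		return nil
	}
	hashes := make([]uint64, n)
	for i := range hashes {
		hashes[i] = hashKmer(seq[i : i+k])
	}
	var out []uint64
	last := -1
	for start := 0; start+w <= n; start++ {
		best := start
		for i := start + 1; i < start+w; i++ {
			if hashes[i] < hashes[best] {
				best = i
			}
		}
		if best != last {
			out = append(out, hashes[best])
			last = best
		}
	}
	return out
}

// histogram counts minimizers in a fixed-size array; jump hash picks the bin.
// size must be positive.
func histogram(mins []uint64, size int32) []uint32 {
	bins := make([]uint32, size)
	for _, m := range mins {
		bins[jumpHash(m, size)]++
	}
	return bins
}

func main() {
	mins := minimizers("ACGTTGCATGCCGATTACGGATCCA", 7, 4)
	fmt.Println(len(mins), histogram(mins, 16))
}
```

Unlike a count-min sketch, the array here has a single counter per bin and no depth parameter, which is consistent with Epsilon and Delta no longer being user-configurable.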
v0.1.2
Minor bug fixes and improvements:
- adding buffered channels for read processing and hashing
- swapping count-min sketch (CMS) parameters back to epsilon and delta to offer more tuning
- tweaking jump hash for counter placement in the CMS
- updating default settings to improve performance after the above changes
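
The buffered-channel change can be pictured with a small pipeline like the one below. This is only a sketch under assumptions: `hashReads`, its parameters, and the FNV-1a hash are hypothetical names and choices for the example, not taken from the HULK source. Buffered channels let the read producer run ahead of the hashing workers instead of blocking on every send.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// hashReads streams reads through a buffered channel to a pool of workers,
// which hash every k-mer and send the hashes down a second buffered channel.
// The order of the returned hashes is not deterministic.
func hashReads(reads []string, k, bufSize, workers int) []uint64 {
	if k < 1 {
		return nil
	}
	if bufSize < 0 {
		bufSize = 0
	}
	if workers < 1 {
		workers = 1
	}
	readChan := make(chan string, bufSize)
	hashChan := make(chan uint64, bufSize)

	// Producer: feed reads into the pipeline, then signal completion.
	go func() {
		defer close(readChan)
		for _, r := range reads {
			readChan <- r
		}
	}()

	// Workers: hash each k-mer of each read.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range readChan {
				for j := 0; j+k <= len(r); j++ {
					h := fnv.New64a()
					h.Write([]byte(r[j : j+k]))
					hashChan <- h.Sum64()
				}
			}
		}()
	}

	// Close the hash channel once every worker has finished.
	go func() {
		wg.Wait()
		close(hashChan)
	}()

	var out []uint64
	for h := range hashChan {
		out = append(out, h)
	}
	return out
}

func main() {
	hashes := hashReads([]string{"ACGTACGT", "GGCC"}, 4, 8, 2)
	fmt.Println(len(hashes), "k-mer hashes")
}
```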
v0.1.0
This release bumps HULK to a more stable version. Here is a summary of the main changes:
- swap uint64 encoding of k-mers to instead use ntHash (Go implementation)
- replace delta+epsilon values in CMS with a soft memory limit for the CMS structure
- use jump hash for adjusting/querying CMS counters
  - also change the XORing of the hash function so that the uint64 is split into two uint32s, with one of these being altered using the CMS depth iterator
- allow FASTA input
- bug fixes (histosketch metadata, weighted Jaccard similarity)
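
The hash-splitting change can be sketched roughly as follows. The release notes do not say exactly how the depth iterator alters the 32-bit half, so the multiply-by-odd step in `index` is an assumption, as are the type and method names. Column placement uses a plain modulo here for brevity, whereas the release describes jump hash for that step.

```go
package main

import "fmt"

// countMin is a minimal count-min sketch: depth rows of width counters.
type countMin struct {
	width  uint32
	counts [][]uint32
}

func newCountMin(depth int, width uint32) *countMin {
	if depth < 1 {
		depth = 1
	}
	if width == 0 {
		width = 1
	}
	counts := make([][]uint32, depth)
	for i := range counts {
		counts[i] = make([]uint32, width)
	}
	return &countMin{width: width, counts: counts}
}

// index derives a per-row column from one 64-bit hash. The hash is split
// into two 32-bit halves; the high half is altered using the row number
// (assumed scheme: multiply by an odd factor) and XORed with the low half.
func (c *countMin) index(hash uint64, row int) uint32 {
	lo := uint32(hash)
	hi := uint32(hash >> 32)
	return (lo ^ (hi * uint32(2*row+1))) % c.width // modulo stands in for jump hash
}

// add increments one counter in every row.
func (c *countMin) add(hash uint64) {
	for row := range c.counts {
		c.counts[row][c.index(hash, row)]++
	}
}

// estimate returns the minimum counter across rows, which never
// underestimates the true count.
func (c *countMin) estimate(hash uint64) uint32 {
	min := ^uint32(0)
	for row := range c.counts {
		if v := c.counts[row][c.index(hash, row)]; v < min {
			min = v
		}
	}
	return min
}

func main() {
	cms := newCountMin(4, 1024)
	for i := 0; i < 3; i++ {
		cms.add(0xDEADBEEFCAFEBABE)
	}
	fmt.Println(cms.estimate(0xDEADBEEFCAFEBABE))
}
```

Deriving every row's index from a single 64-bit hash means each k-mer is hashed once rather than once per row.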