Releases: MinishLab/model2vec
v0.3.3
What's Changed
- feat: Added onnx and tokenizer files support script by @Pringled in #119
- docs: Update readme by @Pringled in #122
- fix: Fixed CI by @Pringled in #124
- docs: Updated results table by @Pringled in #125
- docs: Updated slogan by @Pringled in #127
- fix: Added jinja2 requirement by @Pringled in #128
- Bumped version by @Pringled in #129
Full Changelog: v0.3.2...v0.3.3
v0.3.2
v0.3.1
What's Changed
- fix: update added tokens to be more agnostic by @stephantul in #107
- fix: don't rely on reported vocab size, log warning if inconsistent by @stephantul in #109
- docs: Fixed broken links by @Pringled in #112
- feat: make encode_batch_fast optional by @stephantul in #113
- fix: normalize would lead to NaN for empty docs by @stephantul in #114
- docs: Add tokenlearn results by @Pringled in #116
- docs: Updated plot by @Pringled in #117
- Bump version by @Pringled in #118
Full Changelog: v0.3.0...v0.3.1
v0.3.0
What's Changed
- fix: Fix token type ids not supported by @Pringled in #77
- docs: Add deduplication tutorial by @Pringled in #72
- Fix distill model bos and eos token by @zechengz in #78
- docs: Added Sentence Transformers example code by @Pringled in #80
- docs: Update readme by @Pringled in #81
- docs: Move results and add blogpost by @Pringled in #82
- docs: Fixed broken link by @Pringled in #84
- fix: move tensor to cpu by @stephantul in #86
- feat: Numpy inference by @stephantul in #87
- feat: local loading by @stephantul in #88
- feat: faster tokenization by @stephantul in #89
- enhancement: Add dynamic version by @stephantul in #91
- enhancement: Add explained variance messages by @stephantul in #92
- docs: Updated slogan by @Pringled in #94
- feat: Add python3.9 support by @Pringled in #96
- enh: remove CLI command by @stephantul in #98
- fix: rename show progress bar argument by @stephantul in #99
- fix: Reverted eos bos change by @Pringled in #101
- docs: Added results link by @Pringled in #102
- docs: Fix broken link by @Pringled in #103
- increment version by @stephantul in #104
New Contributors
- @zechengz made their first contribution in #78
Full Changelog: v0.2.4...v0.3.0
v0.2.4
What's Changed
- Add support for huggingface_hub>=0.25.0 by @tomaarsen in #73
- Bump version by @Pringled in #74
Full Changelog: v0.2.3...v0.2.4
v0.2.3
What's Changed
- fix: Updated config by @Pringled in #63
- docs: Added BPEmb results by @Pringled in #64
- Add token encode function by @stephantul in #65
- fix: bug with devices not being managed properly by @stephantul in #66
- fix: make config optional by @stephantul in #67
- enh: add dim property by @stephantul in #68
- docs: add token embedding description to README by @stephantul in #69
- fix: issue with model info missing for local model by @stephantul in #70
- Bump version by @Pringled in #71
Full Changelog: v0.2.2...v0.2.3
v0.2.2
What's Changed
- docs: Update results by @Pringled in #51
- fix: convert bfloat16 to float because of numpy incompatibility by @stephantul in #53
- fix: Add explicit errors for BPE and unigram, return tokenizer without cha… by @stephantul in #54
- Add from model by @stephantul in #57
- fix: Attention mask being None crashes distillation by @stephantul in #58
- fix: allow PCA dims == dims by @stephantul in #59
- enh: Add device selection mechanism by @stephantul in #60
- Fix in README by @stephantul in #61
- Bumped version by @Pringled in #62
Full Changelog: v0.2.1...v0.2.2
v0.2.1
What's Changed
- Bump version to 0.2.0 by @stephantul in #46
- Add median token length as limit by @stephantul in #47
- docs: Add codecov badge by @Pringled in #49
Full Changelog: v0.2.0...v0.2.1
v0.2.0
What's Changed
- remove broken dim property by @stephantul in #26
- Fix issue with hasattr vs getattr by @stephantul in #27
- Add retrieval tutorial by @Pringled in #28
- docs: Add readme headline by @Pringled in #29
- [fix] Various fixes for non-Posix machines by @tomaarsen in #30
- Add model cards by @Pringled in #31
- Hybrid tokenizers by @stephantul in #25
- docs: Added new image by @Pringled in #32
- [enh] Rely on Jinja for model card, use model id/path in snippet by @tomaarsen in #37
- remove sentence transformers dependency by @stephantul in #35
- Fix bug where we accidentally pass a pretrainedtokenizer by @stephantul in #40
- Add support for safetensors by @stephantul in #36
- Make token optional and private an argument, add template by @stephantul in #39
- tests: Add distill tests and CI by @Pringled in #42
- docs: Added codecov badge by @Pringled in #43
- Logos by @stephantul in #44
- Add model card loading by @stephantul in #45
New Contributors
- @tomaarsen made their first contribution in #30
Full Changelog: v0.1.2...v0.2.0