Releases: databricks/lilac
v0.1.11
What's Changed
- Swap in new embedding-chunked splitter algorithm by @brilee in #791
- Add the language model reference concept to Lilac. by @nsthorat in #797
- Improve ergonomics of map(). by @nsthorat in #795
Other Changes
- Fix a flaky test by @dsmilkov in #792
- Add pdb-compatible script and documentation by @brilee in #794
- Switch splitter to new algorithm, fixing last blocking bug. by @brilee in #796
- Add more examples to the language model reference. by @nsthorat in #798
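For readers unfamiliar with the idea behind an embedding-based splitter, here is a rough, generic sketch: split text into sentences, embed each one, and start a new chunk whenever adjacent sentences are dissimilar. This is not the algorithm from #791 or #796. The names `embed`, `cosine`, and `chunk_by_similarity`, the toy bag-of-words embedding, and the `threshold` value are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def chunk_by_similarity(text, threshold=0.2):
    """Group adjacent sentences whose similarity is at least `threshold`."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks = []
    for sentence in sentences:
        # Compare against the last sentence of the current chunk.
        if chunks and cosine(embed(chunks[-1][-1]), embed(sentence)) >= threshold:
            chunks[-1].append(sentence)
        else:
            chunks.append([sentence])
    return [" ".join(chunk) for chunk in chunks]


text = "Cats purr. Cats sleep a lot. Stocks fell today. Stocks may rise tomorrow."
chunks = chunk_by_similarity(text)
for chunk in chunks:
    print(chunk)
```

A real splitter would swap the toy `embed` for a learned embedding model and would likely tune the threshold or cluster rather than compare only neighbours.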
Full Changelog: v0.1.10...v0.1.11
v0.1.10
What's Changed
- Add the simplest dataset.map() by @nsthorat in #769
- Add sampling to our `ParquetSource` by @dsmilkov in #773
- Add intelligent sampling in `ParquetSource` by @dsmilkov in #778
- Add `include_labels` and `exclude_labels` when exporting data via `dataset.to_*` by @dsmilkov in #768
- Improve the Export Dataset modal dialog by @dsmilkov in #775
- Allow searching by pre-computed concepts in the searchbox. by @nsthorat in #783
- Add spacy embedding-clustering splitter by @brilee in #784
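The include_labels and exclude_labels export options listed above amount to a row filter applied at export time. The following is a minimal sketch of that idea in plain Python, not Lilac's implementation. The function name `filter_rows_by_labels`, the row shape, and the exact semantics (keep a row if it carries at least one included label and no excluded label) are assumptions.

```python
def filter_rows_by_labels(rows, include_labels=None, exclude_labels=None):
    """Keep rows matching the include set and not matching the exclude set.

    Hypothetical row shape: {'text': str, 'labels': set of str}.
    An empty or missing include set means "no include filter".
    """
    include = set(include_labels or [])
    exclude = set(exclude_labels or [])
    kept = []
    for row in rows:
        labels = set(row.get("labels", ()))
        if include and not (labels & include):
            continue  # Row has none of the required labels.
        if labels & exclude:
            continue  # Row has a label that was explicitly excluded.
        kept.append(row)
    return kept


rows = [
    {"text": "a", "labels": {"good"}},
    {"text": "b", "labels": {"good", "toxic"}},
    {"text": "c", "labels": set()},
]
exported = filter_rows_by_labels(rows, include_labels=["good"], exclude_labels=["toxic"])
print([row["text"] for row in exported])
```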
Other Changes
- Add notebook exploring chunking algorithms by @brilee in #771
- Remove TextSplitterSignal, fixing related tests. by @brilee in #774
- Fix the concept labeler with the new preview concept key. by @nsthorat in #780
- Improve export preview text. by @nsthorat in #782
- Upgrade ruff version and fix associated new linter errors by @brilee in #777
New Contributors
Full Changelog: v0.1.9...v0.1.10
v0.1.9
v0.1.8
v0.1.7
v0.1.6
Features
- Make labels toggle-able by @dsmilkov in #748
- Add HDBScan with UMAP by @nsthorat in #749
- Add lilac docker image and use a lighter 5x image for HF by @dsmilkov in #750
Bug fixes / Other
- Fix duplicate stats requests by @dsmilkov in #754
- Add CLI prompts for token in deploy script by @nsthorat in #755
- Add docker deploy instructions in `dev.md` and multi-platform build by @dsmilkov in #753
Full Changelog: v0.1.15...v0.1.6
v0.1.15
Features
- Improve markdown rendering by @dsmilkov in #728
- Add "Group by" mode by @dsmilkov in #735
- Add LILAC_AUTH_ADMIN_EMAILS to set admins. by @nsthorat in #733
- Add LILAC_AUTH_USER_DISABLE_LABEL_ALL by @nsthorat in #741
- Allow non-admin users to add labels, but not create label types. by @nsthorat in #734
- Add a github source and a LlamaIndexDocsSource source. by @nsthorat in #740
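As an illustration of how an environment variable such as LILAC_AUTH_ADMIN_EMAILS could drive an admin check, here is a small sketch. It is not Lilac's code: the helper name `is_admin`, the comma-separated value format, and the case-insensitive comparison are all assumptions, so check the project documentation for the actual format.

```python
import os


def is_admin(email, env=None):
    """Return True if `email` appears in LILAC_AUTH_ADMIN_EMAILS.

    Assumes (hypothetically) that the variable holds a comma-separated list.
    """
    env = os.environ if env is None else env
    raw = env.get("LILAC_AUTH_ADMIN_EMAILS", "")
    admins = {entry.strip().lower() for entry in raw.split(",") if entry.strip()}
    return email.strip().lower() in admins


# Pass a plain dict instead of os.environ to keep the example side-effect free.
example_env = {"LILAC_AUTH_ADMIN_EMAILS": "alice@example.com, bob@example.com"}
print(is_admin("Alice@example.com", example_env))
print(is_admin("carol@example.com", example_env))
```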
Bug fixes / other changes
- Fix bug with dataset config on upload. by @nsthorat in #729
- Upgrade duckdb to 0.9 by @dsmilkov in #731
- First prototype of RAG UI. by @nsthorat in #732
- Extract the dropdown+pill into a `DropPill` component by @dsmilkov in #742
- Make llama-index an optional dependency. by @nsthorat in #744
- Bug fixes for deploy scripts. by @nsthorat in #745
Full Changelog: v0.1.4...v0.1.15
v0.1.4
v0.1.3
You can now deploy Lilac to a HuggingFace Space with just a few lines of Python, or from the CLI:
Deploy a single configuration object, and have it load entirely on the space:
import lilac as ll

ll.deploy_config(
  hf_space='nsthorat-lilac/nikhil-demo',
  # Create the space if it doesn't exist.
  create_space=True,
  config=ll.Config(datasets=[
    ll.DatasetConfig(
      namespace='local',
      name='glue_ax',
      source=ll.HuggingFaceSource(dataset_name='glue', config_name='ax'))
  ]))
Deploy a Lilac project you've loaded locally:
ll.deploy_project(
  hf_space='nsthorat-lilac/nikhil-project-demo',
  project_dir='./data',
  datasets=['local/glue_ax'],  # This is optional. If not defined, uploads all datasets.
  # Create the space if it doesn't exist.
  create_space=True)
Or via the CLI:
lilac deploy-project --project_dir='~/my_project'
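One thing to watch with the CLI form: most shells pass a tilde inside single quotes through literally, so whether `~/my_project` resolves depends on the tool expanding it. The sketch below sidesteps the question by expanding the path in Python before building the command. The helper name `build_deploy_command` is hypothetical, and only the `--project_dir` flag shown above is used.

```python
import os
import shlex


def build_deploy_command(project_dir):
    """Build the argv for `lilac deploy-project` with an absolute project path."""
    expanded = os.path.abspath(os.path.expanduser(project_dir))
    return ["lilac", "deploy-project", f"--project_dir={expanded}"]


argv = build_deploy_command("~/my_project")
# Print the shell-safe command line instead of running it.
print(shlex.join(argv))
```

To actually run the deploy, the list can be passed to `subprocess.run(argv, check=True)`.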
Features
Other Changes
Demo
- Add the textbook quality programming dataset to the demo. Clean up old datasets. by @nsthorat in #723
Full Changelog: v0.1.2...v0.1.3