v0.1.1

@nsthorat nsthorat released this 26 Sep 00:08
· 398 commits to main since this release

Overview

  • Embedding computation can now handle larger-than-RAM datasets: embeddings are written to the vector store iteratively as they are computed, instead of being held in memory all at once.
  • JSON and CSV sources are heavily optimized and now go through DuckDB for parsing.
  • Clustering now supports semantic clustering over embeddings, using DBSCAN.
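
As a rough illustration of the first point, the sketch below streams documents through an embedding function in small batches and hands each batch to a store as soon as it is ready, so only one batch of vectors is held by the loop at a time. This is not Lilac's code: `ListVectorStore`, `toy_embed`, `batched`, and `write_embeddings` are hypothetical names invented for this example, and the in-memory store merely stands in for a real vector store that would persist to disk.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List

Vector = List[float]


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield lists of at most `size` items without materializing the input."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


class ListVectorStore:
    """Hypothetical stand-in for a vector store (a real one would write to disk)."""

    def __init__(self) -> None:
        self.vectors: List[Vector] = []
        self.num_writes = 0

    def add(self, vectors: List[Vector]) -> None:
        self.vectors.extend(vectors)
        self.num_writes += 1


def toy_embed(texts: List[str]) -> List[Vector]:
    """Toy embedding (text length, vowel count); stands in for a real model."""
    return [[float(len(t)), float(sum(c in "aeiou" for c in t))] for t in texts]


def write_embeddings(
    texts: Iterable[str],
    embed: Callable[[List[str]], List[Vector]],
    store: ListVectorStore,
    batch_size: int = 2,
) -> int:
    """Embed `texts` batch by batch, writing each batch before reading the next."""
    total = 0
    for batch in batched(texts, batch_size):
        store.add(embed(batch))
        total += len(batch)
    return total


# The input is a generator, so the documents are never all in memory at once.
docs = (f"document {i}" for i in range(5))
store = ListVectorStore()
count = write_embeddings(docs, toy_embed, store, batch_size=2)
print(count, store.num_writes)
```

Because the input is consumed lazily, peak memory in the loop depends on `batch_size` rather than on the total number of documents.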

New features

  • Add SQLite source and optimize the JSON and CSV sources by @dsmilkov in #710
  • Add a dict source and convert LangSmith source to use it by @dsmilkov in #716
  • Add clustering signal by @dsmilkov in #711
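
For readers unfamiliar with DBSCAN, the following is a minimal pure-Python sketch of the algorithm applied to a handful of toy 2-D "embeddings". It is only meant to show the idea of density-based clustering; it is not the clustering signal's implementation, and the function names, `eps`, and `min_samples` values are assumptions chosen for the example.

```python
import math
from typing import List, Sequence

NOISE = -1       # label for points that belong to no cluster
UNVISITED = -2   # internal marker for points not yet processed


def region_query(points: Sequence[Sequence[float]], idx: int, eps: float) -> List[int]:
    """Return indices of all points within `eps` of points[idx] (including itself)."""
    return [j for j, p in enumerate(points) if math.dist(points[idx], p) <= eps]


def dbscan(points: Sequence[Sequence[float]], eps: float, min_samples: int) -> List[int]:
    """Assign a cluster id to each point, or NOISE if it is in no dense region."""
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] != UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster_id += 1
    return labels


# Two tight groups of toy vectors plus one outlier.
embeddings = [
    [0.0, 0.0], [0.0, 0.1], [0.1, 0.0],
    [5.0, 5.0], [5.0, 5.1], [5.1, 5.0],
    [10.0, 0.0],
]
labels = dbscan(embeddings, eps=0.5, min_samples=2)
print(labels)
```

Unlike k-means, DBSCAN does not need the number of clusters up front and can mark isolated points as noise, which is one reason it is a common choice for clustering text embeddings.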

Performance

  • Use iterables for compute_signal and compute_embedding. by @nsthorat in #706
  • Write embeddings to the vector store iteratively by @nsthorat in #709
  • Add SQLite source and optimize the JSON and CSV sources by @dsmilkov in #710
  • Speed up the Docker image build step by installing lilac from pip before installing the local wheel. by @nsthorat in #714
  • Improve perf of server by removing UUID sort by @dsmilkov in #715

Bug fixes

  • Fix semantic search on repeated by @dsmilkov in #704
  • Fix syntax error with keyword search by @dsmilkov in #705
  • Fix bug with span highlighting a repeated field by @nsthorat in #713
  • Change the bootup load to be during the new FastAPI lifecycle API. by @nsthorat in #717

Full Changelog: v0.1.0...v0.1.1