trulens-1.1.0
What's Changed
TruLens 1.1 has a ton of exciting changes. We've grouped the updates by the new features they support, so you can jump straight to the ones you're most excited about:
- TruLens Dashboard
- Feedback Provider Support
- Search Metric Support
- Adding dataframes to TruLens
- OpenTelemetry Support
- Async and Streaming Support
- More Reliable Feedback Functions
- New Examples
- Docs Updates
- Bug Fixes
TruLens Dashboard
In TruLens 1.1, we re-imagined the dashboard with a focus on making it easy to track large numbers of experiments, make comparisons and improve your apps for production. We also made several improvements to performance and usability, including dark mode.
Read more about the new-look dashboard.
See the changes:
- Dark mode for Trace viewer by @sfc-gh-gtokernliang in #1437
- Make styling more compatible with dark mode for feedback functions by @sfc-gh-gtokernliang in #1439
- Add missing UX components to streamlit feedback component by @sfc-gh-jreini in #1440
- Dashboard Enhancements by @sfc-gh-chu in #1443
- leaderboard list view fix by @sfc-gh-chu in #1491
- small perf improvements by @sfc-gh-chu in #1490
- fix to sql query bug in dashboard by @sfc-gh-pmardziel in #1531
- Fix leaderboard showing inconsistent latency readings by @sfc-gh-chu in #1522
Expanded Search Metric Support
TruLens now supports common information retrieval (search) metrics including IR Hit Rate, NDCG, Precision, Recall, Mean Reciprocal Rank and more. These new metrics are accessible as ground truth feedback functions and simply require the addition of expected_chunks to your ground truth data. Try the example.
See the change:
- Information retrieval (search) metrics computation with ground truth datasets - notebook + metrics implementation by @sfc-gh-dhuang in #1545
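For readers new to these metrics, here is a minimal plain-Python sketch of how a few of them can be computed from a ranked list of retrieved chunks and a set of expected chunks. It is not the TruLens implementation: the function names, the binary relevance assumption, and the toy chunk IDs are all illustrative, and the library's own definitions may differ in details such as tie handling or graded relevance.

```python
import math
from typing import Sequence, Set


def hit_rate(retrieved: Sequence[str], expected: Set[str], k: int) -> float:
    """1.0 if any expected chunk appears in the top-k results, else 0.0."""
    return 1.0 if any(chunk in expected for chunk in retrieved[:k]) else 0.0


def precision_at_k(retrieved: Sequence[str], expected: Set[str], k: int) -> float:
    """Fraction of the top-k slots filled by expected chunks."""
    if k <= 0:
        return 0.0
    return sum(chunk in expected for chunk in retrieved[:k]) / k


def recall_at_k(retrieved: Sequence[str], expected: Set[str], k: int) -> float:
    """Fraction of expected chunks found in the top-k results."""
    if not expected:
        return 0.0
    return len(set(retrieved[:k]) & expected) / len(expected)


def reciprocal_rank(retrieved: Sequence[str], expected: Set[str]) -> float:
    """1 / rank of the first expected chunk, or 0.0 if none is retrieved."""
    for rank, chunk in enumerate(retrieved, start=1):
        if chunk in expected:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(retrieved: Sequence[str], expected: Set[str], k: int) -> float:
    """NDCG with binary relevance: DCG of the ranking over the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, chunk in enumerate(retrieved[:k], start=1)
        if chunk in expected
    )
    ideal_hits = min(len(expected), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example: chunk IDs are made up for illustration.
retrieved_chunks = ["c3", "c1", "c7", "c2"]
expected_chunks = {"c1", "c2"}

scores = {
    "hit_rate@3": hit_rate(retrieved_chunks, expected_chunks, 3),
    "precision@3": precision_at_k(retrieved_chunks, expected_chunks, 3),
    "recall@3": recall_at_k(retrieved_chunks, expected_chunks, 3),
    "reciprocal_rank": reciprocal_rank(retrieved_chunks, expected_chunks),
    "ndcg@3": ndcg_at_k(retrieved_chunks, expected_chunks, 3),
}
```

Mean Reciprocal Rank is then the average of reciprocal_rank over all queries in the ground truth set.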
Getting started with existing data
It's now easier than ever to get started with TruLens. Starting with a dataframe with query, response and contexts columns, you can load it to TruLens using add_dataframe and easily run feedback functions against your data. Try it yourself.
See the change:
- add_dataframe method + quickstart by @sfc-gh-jreini in #1474
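As a rough sketch of the data layout described above, the snippet below builds toy columns named query, response and contexts and checks them before loading. Only the three column names and the add_dataframe method name come from these notes. Everything else is an assumption to verify against the quickstart: the helper names are hypothetical, storing contexts as a list of strings per row is a guess, and the TruVirtual / VirtualApp import path and constructor arguments in load_into_trulens are assumed rather than confirmed for this release.

```python
from typing import Dict, List

REQUIRED_COLUMNS = ("query", "response", "contexts")


def build_columns() -> Dict[str, List]:
    """Toy data in the column layout named in the notes; values are made up."""
    return {
        "query": ["Where is Germany?", "What is the capital of France?"],
        "response": ["Germany is in Europe.", "The capital of France is Paris."],
        # Assumption: one list of context strings per row.
        "contexts": [
            ["Germany is a country located in Europe."],
            ["France is a country in Europe and its capital is Paris."],
        ],
    }


def check_columns(columns: Dict[str, List]) -> None:
    """Raise ValueError if a required column is missing or lengths differ."""
    missing = [name for name in REQUIRED_COLUMNS if name not in columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    lengths = {len(columns[name]) for name in REQUIRED_COLUMNS}
    if len(lengths) != 1:
        raise ValueError("all columns must have the same number of rows")


def load_into_trulens(columns: Dict[str, List]):
    """Assumed usage, not verified against this release; check the quickstart."""
    # Imported lazily so the helpers above work without these packages installed.
    import pandas as pd
    from trulens.apps.virtual import TruVirtual, VirtualApp  # assumed import path

    check_columns(columns)
    df = pd.DataFrame(columns)
    recorder = TruVirtual(  # assumed constructor arguments
        app_name="existing data",
        app_version="v1",
        app=VirtualApp(),
    )
    return recorder.add_dataframe(df)


columns = build_columns()
check_columns(columns)
```

Calling load_into_trulens(columns) requires pandas and TruLens to be installed; feedback functions would then be attached to the recorder as shown in the quickstart.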
Experimental support for OpenTelemetry
We've added experimental preview support for OpenTelemetry, enabled with session.experimental_enable_feature("otel_tracing"). We are collecting feedback and will continue to improve the user experience for writing and reading OpenTelemetry traces. To try it out, start with the custom Python or Llama-Index examples.
See the changes:
- OTEL import/export by @sfc-gh-pmardziel in #1485
- experimental flags by @sfc-gh-pmardziel in #1427
Restored Async and Streaming Support
- memory, threads, and async leakage testing by @sfc-gh-pmardziel in #1470
- fix async handling and other release pipeline failures by @sfc-gh-pmardziel in #1441
More reliable feedback functions
- Simplify system prompt generation conditions with output space and criteria by @sfc-gh-dhuang in #1554
- handle partial functions for feedback functions by @sfc-gh-chu in #1551
- More error handling for groundedness internal steps by @sfc-gh-jreini in #1549
- RAG triads llm as judges benchmark - adding meta-eval metrics for correlation measurement and experiment notebooks by @sfc-gh-dhuang in #1462
- Add option to filter trivial statements for groundedness measure by @sfc-gh-pdharmana in #1556
- Fix splitting key_points issue: generalize the solution for splitting key points in _assess_key_point_inclusion() by @dom7kim in #1519
Feedback Provider Support
- Add mistral-large2 to the list of supported models in Cortex feedback provider by @sfc-gh-dhuang in #1496
- Claude 3 support for AWS Bedrock by @sfc-gh-chu in #1481
- Switch to llama 3.1 8b as default model in cortex by @sfc-gh-dhuang in #1500
- Support having a Langchain provider with a BaseLLM and not just BaseChatModel. by @sfc-gh-dkurokawa in #1459
New Examples
- Cortex Fine-tuning experiments notebook by @sfc-gh-jreini in #1453
- Cortex Chat Quickstart by @sfc-gh-jreini in #1446 and #1460
- Server side feedback computation + batch ingestion by @sfc-gh-jreini in #1464
- New Custom Streaming example by @sfc-gh-pmardziel in #1441
Docs Updates
- az badge update by @sfc-gh-chu in #1436
- docs nits by @sfc-gh-jreini in #1434
- Docs Changes by @sfc-gh-chu in #1473
- website analytics and dark mode fixes by @sfc-gh-chu in #1497
- Add blog site and docs grouping by @sfc-gh-chu in #1499
- Fix colab links by @sfc-gh-jreini in #1508
- Josh/center homepage image text + change app versions compared by @sfc-gh-jreini in #1442
- Fix homepage blog link by @sfc-gh-chu in #1535
Bug Fixes
- endpoint kwargs by @sfc-gh-chu in #1489
- Update threading.py, fix context loss in multi-threading by @glennfeys in #1478
- Fix Selector AttributeError by @sfc-gh-chu in #1553
- SQLAlchemy joinedload on record.app relationship by @sfc-gh-chu in #1524
- release pipeline related fixes by @sfc-gh-pmardziel in #1435
- fix trulens_eval migration link by @sfc-gh-pmardziel in #1448
- fix typo in Makefile by @sfc-gh-pmardziel in #1463
- Add progress bars to data migration scripts by @sfc-gh-chu in #1458
- cortex instrumentation fixes by @sfc-gh-chu in #1447
- Conda Meta Hash fix by @sfc-gh-srudenko in #1468
- fix optionals in core by @sfc-gh-pmardziel in #1471
- fix optional import message by @sfc-gh-chu in #1457
- Allow for DBs that already have tables (at least if they're not sqlite databases). by @sfc-gh-dkurokawa in #1449
- Bumping conda meta to build a conda package 1.0.2 by @sfc-gh-srudenko in #1479
- Move requests/Endpoint.post to huggingface provider by @sfc-gh-chu in #1476
- relax minor package version constraints by @sfc-gh-chu in #1482
- init_server_side=False by default by @sfc-gh-chu in #1483
- slight downgrade of minimum dep requirements by @sfc-gh-chu in #1504
- Cleanup main pyproject and relax minor versions by @sfc-gh-chu in #1494
- Updates to the query planning notebook by @sfc-gh-dhuang in #1512 and @sfc-gh-jreini in #1514
- Conda build meta changes by @sfc-gh-srudenko in #1503
- Dashboard fixes by @sfc-gh-jreini in #1518
- Bump the pip group across 1 directory with 2 updates by @dependabot in #1507
- small record ingest formatting fix by @sfc-gh-chu in #1515
- Set criteria for feedbacks correctly. by @sfc-gh-dkurokawa in #1526
- Allow using Snowflake Connector for the actual DB Connector instead of the connection parameters. by @sfc-gh-dkurokawa in #1527
- split off python compatibility utilities by @sfc-gh-pmardziel in #1528
- Endpoints as Multitons and Singletons represented as metaclasses by @sfc-gh-chu in #1523
- Fix SnowflakeConnector connection URL by @sfc-gh-chu in #1544
- Bump vite from 4.5.3 to 4.5.5 in /src/dashboard/react_components/record_viewer in the npm_and_yarn group across 1 directory by @dependabot in #1502
- Store endpoints in list with weakref.ref by @sfc-gh-chu in #1555
New Contributors
- @glennfeys made their first contribution in #1478
- @dom7kim made their first contribution in #1517
Full Changelog: trulens-1.0.11...trulens-1.1.0