Releases: truera/trulens

TruLens 1.3.0

10 Jan 21:50
cdb520e

Optimizing Feedback Functions

In this release, we add important changes to help you improve the alignment of your LLM-judge evals with human evaluations.

Global Improvement of Groundedness Feedback

The first is the global improvement of the groundedness feedback function (benchmarks and methods forthcoming). We invite any users to submit feedback (positive or negative) on the effectiveness of the new groundedness function using GitHub Issues or Discussions.

You can view the addition of new groundedness criteria in the GitHub diff below.

[Screenshot: GitHub diff showing the new groundedness criteria]

New levers for aligning feedback functions

The second change adds new easy-to-use levers for changing the behavior of feedback functions: few-shot examples and custom criteria. Early customers have benefited from using these levers to align their feedback functions with their collected expert evaluations.
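One way to check whether a lever is helping is to compare judge scores against expert scores on the same records before and after the change. The sketch below is not part of TruLens; the helper names `pearson` and `mean_absolute_error` and the toy scores are hypothetical, and it assumes both sets of scores use the same numeric scale (0 to 3 here, matching the few-shot example in these notes).

```python
import math
from statistics import mean
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined when one list is constant.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Average absolute gap between judge and human scores."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty score lists of equal length")
    return mean(abs(x - y) for x, y in zip(xs, ys))


# Hypothetical toy data: expert labels and LLM-judge scores for five records.
human_scores = [3, 0, 2, 1, 3]
judge_scores = [3, 1, 2, 0, 2]

print("correlation:", pearson(human_scores, judge_scores))
print("mean absolute error:", mean_absolute_error(human_scores, judge_scores))
```

Higher correlation and lower error after adding criteria or examples suggest the judge is moving closer to the experts; a rank correlation may be a better fit when only the ordering of scores matters.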

Adding custom criteria to a feedback function

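# Assumes `provider` is an LLM-based feedback provider that has already been
# constructed (for example, an instance of trulens.providers.openai.OpenAI).
custom_criteria = """
A positive sentiment should be expressed with an extremely encouraging and enthusiastic tone.
"""

provider.sentiment(
    "When you're ready to start your business, you'll be amazed at how much you can achieve!",
    criteria=custom_criteria,
)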

Adding few-shot examples to guide feedback functions

from trulens.feedback.v2 import feedback

fewshot_relevance_examples_list = [
    (
        {
            "query": "What are the key considerations when starting a small business?",
            "response": "You should focus on building relationships with mentors and industry leaders. Networking can provide insights, open doors to opportunities, and help you avoid common pitfalls.",
        },
        3,
    ),
]

provider.relevance(
    "What are the key considerations when starting a small business?",
    "Find a mentor who can guide you through the early stages and help you navigate common challenges.",
    examples=fewshot_relevance_examples_list,
)

What's Changed

Bug Fixes

Preparations for OpenTelemetry compatibility

Full Changelog: trulens-1.2.11...trulens-1.3.0

TruLens 1.2.11

16 Dec 22:04
5ebd0fd

What's Changed

Full Changelog: trulens-1.2.10...trulens-1.2.11

TruLens 1.2.10

06 Dec 01:22
5c7422c

What's Changed

Full Changelog: trulens-1.2.9...trulens-1.2.10

TruLens 1.2.9

04 Dec 00:12
0614871

What's Changed

Full Changelog: trulens-1.2.6...trulens-1.2.9

TruLens v1.2.6

06 Nov 18:36

What's Changed

Full Changelog: trulens-1.2.4...trulens-1.2.6

TruLens v1.2.4

05 Nov 03:31

What's Changed

Full Changelog: trulens-1.2.2...trulens-1.2.4

TruLens v1.2.2

30 Oct 19:28
5e74741

What's Changed

  • Use snowflake connector over snowpark session in trulens Snowflake DB connector as snowpark session isn't thread-safe. by @sfc-gh-dkurokawa in #1604
  • Don't open extra Snowflake connections and don't recycle connections as quickly. by @sfc-gh-dkurokawa in #1609
  • Remove unnecessary deps from trulens-connectors-snowflake. by @sfc-gh-dkurokawa in #1611

Full Changelog: trulens-1.2.1...trulens-1.2.2

TruLens v1.2.1

29 Oct 19:40
46f05d0

Bug Fixes

New Contributors

Full Changelog: trulens-1.2.0...trulens-1.2.1

TruLens v1.2.0

28 Oct 21:31

What's Changed

Bug Fixes

Examples

Full Changelog: trulens-1.1.0...trulens-1.2.0

trulens-1.1.0

10 Oct 13:13

What's Changed

TruLens 1.1 has a ton of exciting changes - we've grouped the updates into the new features they support so you can jump straight to the updates you're most excited about:

  • TruLens Dashboard
  • Feedback Provider Support
  • Search Metric Support
  • Adding dataframes to TruLens
  • OpenTelemetry Support
  • Async and Streaming Support
  • More Reliable Feedback Functions
  • New Examples
  • Docs Updates
  • Bug Fixes

TruLens Dashboard

In TruLens 1.1, we re-imagined the dashboard with a focus on making it easy to track large numbers of experiments, make comparisons, and improve your apps for production. We also made several improvements to performance and usability, including dark mode.

Read more about the new look dashboard.

See the changes:

Expanded Search Metric Support

TruLens now supports common information retrieval (search) metrics, including IR Hit Rate, NDCG, Precision, Recall, Mean Reciprocal Rank, and more. These new metrics are accessible as ground truth feedback functions and only require adding expected_chunks to your ground truth data. Try the example.
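For readers unfamiliar with these metrics, the following is a minimal sketch of what each one measures for a single query, using binary relevance. It is independent of the TruLens implementation: the function names are hypothetical, and it assumes retrieved chunks are unique identifiers compared against a set of expected chunks.

```python
import math
from typing import Sequence, Set


def hit_rate(retrieved: Sequence[str], expected: Set[str], k: int) -> float:
    """1.0 if any expected chunk appears in the top-k results, else 0.0."""
    return 1.0 if any(c in expected for c in retrieved[:k]) else 0.0


def precision_at_k(retrieved: Sequence[str], expected: Set[str], k: int) -> float:
    """Fraction of the top-k results that are expected chunks."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    return sum(c in expected for c in top_k) / len(top_k)


def recall_at_k(retrieved: Sequence[str], expected: Set[str], k: int) -> float:
    """Fraction of the expected chunks found in the top-k results."""
    if not expected:
        return 0.0
    return len(set(retrieved[:k]) & expected) / len(expected)


def reciprocal_rank(retrieved: Sequence[str], expected: Set[str]) -> float:
    """1 / rank of the first expected chunk; 0.0 if none is retrieved."""
    for rank, chunk in enumerate(retrieved, start=1):
        if chunk in expected:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(retrieved: Sequence[str], expected: Set[str], k: int) -> float:
    """Binary-relevance NDCG: DCG of the top-k divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, chunk in enumerate(retrieved[:k], start=1)
        if chunk in expected
    )
    ideal_hits = min(len(expected), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: four retrieved chunk IDs, two expected chunks.
retrieved_chunks = ["c3", "c1", "c7", "c2"]
expected_chunks = {"c1", "c2"}

print("hit rate@3:", hit_rate(retrieved_chunks, expected_chunks, 3))
print("precision@3:", precision_at_k(retrieved_chunks, expected_chunks, 3))
print("recall@3:", recall_at_k(retrieved_chunks, expected_chunks, 3))
print("reciprocal rank:", reciprocal_rank(retrieved_chunks, expected_chunks))
print("NDCG@3:", ndcg_at_k(retrieved_chunks, expected_chunks, 3))
```

Mean Reciprocal Rank is the average of `reciprocal_rank` over all queries in the ground truth set; the other metrics are typically averaged across queries in the same way.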

See the change:

  • Information retrieval (search) metrics computation with ground truth datasets - notebook + metrics implementation by @sfc-gh-dhuang in #1545

Getting started with existing data

It's now easier than ever to get started with TruLens. Starting with a dataframe that has query, response, and contexts columns, you can load it into TruLens using add_dataframe and easily run feedback functions against your data. Try it yourself.

See the change:

Experimental support for OpenTelemetry

We've added experimental preview support for OpenTelemetry, enabled with session.experimental_enable_feature("otel_tracing"). We are collecting feedback and will continue to improve the user experience for writing and reading OpenTelemetry traces. If you want to try it out, check it out with custom Python or LlamaIndex.

See the changes:

Restored Async and Streaming Support

More reliable feedback functions

  • Simplify system prompt generation conditions with output space and criteria by @sfc-gh-dhuang in #1554
  • handle partial functions for feedback functions by @sfc-gh-chu in #1551
  • More error handling for groundedness internal steps by @sfc-gh-jreini in #1549
  • RAG triads llm as judges benchmark - adding meta-eval metrics for correlation measurement and experiment notebooks by @sfc-gh-dhuang in #1462
  • Add option to filter trivial statements for groundedness measure by @sfc-gh-pdharmana in #1556
  • Fix splitting key_points issue: generalize the solution for splitting key points in _assess_key_point_inclusion() by @dom7kim in #1519

Feedback Provider Support

New Examples

Docs Updates

Bug Fixes
