Releases: truera/trulens
TruLens Eval v0.13.0
Library containing evaluations of LLM Applications
Changelog
- Updated all documentation to show context recorder usage
- Smoke tests now run against trulens-eval
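The context-recorder pattern referenced above can be sketched as a minimal Python context manager. This is illustrative only, not the trulens-eval implementation; the `Recorder` class and its fields are hypothetical stand-ins for the library's recorder objects:

```python
# Minimal sketch of a context-manager recorder: calls made through the
# recorder inside the `with` block are captured for later evaluation.
class Recorder:
    def __init__(self, app):
        self.app = app           # the wrapped LLM app (any callable)
        self.records = []        # captured input/output pairs

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False             # do not suppress exceptions

    def __call__(self, *args, **kwargs):
        out = self.app(*args, **kwargs)
        self.records.append({"input": args, "output": out})
        return out


def toy_app(prompt: str) -> str:
    return prompt.upper()        # stand-in for a real LLM app


with Recorder(toy_app) as recording:
    answer = recording("hello")

print(answer)                    # -> HELLO
print(len(recording.records))    # -> 1
```

The same shape (wrap the app, enter a `with` block, invoke, inspect records afterward) is what the updated documentation demonstrates.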
Examples
- Examples are restructured for better discoverability
- Added a Milvus Vector DB Example
Bug Fixes
- Removed metadata_fn in examples
TruLens Eval v0.12.0
Changelog
- Added chain of thought and reason metadata to LLM based feedback functions
- Feedback function docs upgrade
- Feedback Function APIs now showing actual APIs with code
- App wrappers (TruChain/TruLLama/etc) docs with code
- More concise selector documentation with code
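The chain-of-thought metadata item above can be illustrated with a small sketch: a feedback result that carries its numeric score together with the reasoning behind it. The `FeedbackResult` dataclass and the scoring heuristic here are hypothetical; a real provider would obtain both score and reasons from an LLM:

```python
from dataclasses import dataclass, field

# Sketch: a feedback result carrying the score plus reason metadata
# (field names are illustrative, not the library's schema).
@dataclass
class FeedbackResult:
    score: float
    meta: dict = field(default_factory=dict)

def relevance_with_reasons(question: str, answer: str) -> FeedbackResult:
    # Deterministic toy heuristic standing in for an LLM judgment.
    overlap = len(set(question.lower().split()) & set(answer.lower().split()))
    score = min(1.0, overlap / max(1, len(question.split())))
    return FeedbackResult(score=score, meta={"reason": f"{overlap} shared terms"})

r = relevance_with_reasons("what is trulens", "trulens evaluates LLM apps")
print(r.score, r.meta["reason"])
```

Keeping reasons alongside scores is what lets the dashboard surface *why* a feedback function scored a record the way it did.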
Examples
- Updated examples to use context recording
Bug Fixes
- Fix for basic app with multiple args
- Fix aggregation bug in multi context groundedness introduced in 0.11.0
- Timeline UI now shows the index of the JSON path when available
- No longer overwrites user changes to streamlit .toml files
- Slow or hanging thread bug fix
TruLens Eval v0.11.0
Changelog
- Added the ability to attach metadata to records
- Added feedback functions for BERTScore, ROUGE, and BLEU scores
- Added more instrumentation for Langchain Agents
- Added the ability to instrument more than the default calls, such as Langchain Prompt Templates
- Added support for tracking via python context managers
- Added badges showing test results on documentation page
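The BLEU-style feedback functions above wrap real metric implementations; as a toy illustration of the interface, here is a unigram-precision score in the spirit of BLEU-1 (the function name and heuristic are ours, not the library's):

```python
from collections import Counter

# Toy unigram precision (BLEU-1 flavor): fraction of candidate tokens
# that also appear in the reference, with clipped counts.
def unigram_precision(candidate: str, reference: str) -> float:
    cand = candidate.lower().split()
    ref = Counter(reference.lower().split())
    if not cand:
        return 0.0
    hits = sum(min(c, ref[w]) for w, c in Counter(cand).items())
    return hits / len(cand)

print(unigram_precision("the cat sat", "the cat sat on the mat"))  # -> 1.0
```

A feedback function of this shape (two strings in, a float in [0, 1] out) is what gets registered and aggregated per record.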
Examples
- Added Llama Index RAG application with a vector store using Milvus
Bug Fixes
- Fix for multi-result introduced in 0.10.0
- Allow FeedbackCall to have JSON args
- Fix error for OpenAI Chat LLM with ChatPromptTemplate
TruLens Eval v0.10.0
Changelog
- Allow connecting to remote database via SQLAlchemy
Bug Fixes
- Patch instrumentation losing track of instrumented objects (Fixes #373)
TruLens Eval v0.9.0
Changelog
- Allow custom feedback function naming
- Allow multi output from feedback functions via key, value output
- Display method name alongside class name in timeline view
- Support for calls to langchain evaluation criteria
- Allow instrumentation tracking of user defined classes for LLM apps
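The multi-output feedback item above can be sketched as a function that returns named scores as key/value pairs rather than a single float, plus a per-key aggregator. Both functions below are hypothetical illustrations of the pattern, not library code:

```python
# Sketch: a feedback function returning multiple named scores,
# aggregated per key across records.
def multi_feedback(answer: str) -> dict:
    words = answer.split()
    return {
        "length_ok": 1.0 if len(words) >= 3 else 0.0,
        "has_citation": 1.0 if "[" in answer else 0.0,
    }

def aggregate(results: list) -> dict:
    keys = results[0].keys()
    return {k: sum(r[k] for r in results) / len(results) for k in keys}

scores = aggregate([multi_feedback("short"), multi_feedback("a longer answer [1]")])
print(scores)  # -> {'length_ok': 0.5, 'has_citation': 0.5}
```

Keyed outputs let one feedback call populate several leaderboard columns at once.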
Examples
- Added Langchain Agents example
Bug Fixes
- Fix for get_records_and_feedback when supplying app_ids
- Silence key not supplied errors on all API calls
- Fix check on AzureOpenAI provider for groundedness feedback function
TruLens Eval v0.8.0
Changelog
- Support for async calls for both langchain (acall) and llama-index (aquery)
- Support for streaming and chat for llama-index (chat, achat, stream_chat, astream_chat)
- Support for user subclassed components for both langchain and llama index
- Support for specifying a db file via Tru(database_file="dbfile")
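The async support above can be sketched with plain asyncio: an app exposes an awaitable `acall` (mirroring langchain's method name from the bullet) and a recorder-style wrapper awaits it. `ToyApp` and `record_async` are illustrative names, not the library's internals:

```python
import asyncio

# Sketch of the async call path: the wrapper awaits the app's `acall`
# and captures the input/output pair as a record.
class ToyApp:
    async def acall(self, prompt: str) -> str:
        await asyncio.sleep(0)           # stand-in for network I/O
        return f"echo: {prompt}"

async def record_async(app: ToyApp, prompt: str) -> dict:
    out = await app.acall(prompt)
    return {"input": prompt, "output": out}

record = asyncio.run(record_async(ToyApp(), "hi"))
print(record["output"])                  # -> echo: hi
```

Streaming variants (stream_chat, astream_chat) follow the same shape but yield chunks instead of a single string.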
Examples
- Added a FAISS example
- Added examples for async calls
- Corrected the literature reference (Alice in Wonderland) in the llama-index subquestion example
- Add groundedness to pinecone and agents examples
- Add colab links to examples
Bug Fixes
- Fix AzureOpenAI to take deployment_id
- Fix code bugs
TruLens Eval v0.7.0
Changelog
- Added Groundedness Feedback functions to verify supporting evidence using Huggingface NLI and OpenAI LLMs
- Updated UI Timeline view to include application component input and output details on click
- Updated UI leaderboard to show application metadata
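The groundedness feedback above uses Huggingface NLI models or OpenAI LLMs to verify supporting evidence; as a toy proxy for the interface only, here is a token-overlap version (our own heuristic, not the shipped implementation):

```python
# Toy groundedness proxy: fraction of answer statements whose tokens are
# mostly covered by the source text. The real feature uses NLI / LLM checks.
def groundedness(source: str, answer: str, threshold: float = 0.5) -> float:
    src = set(source.lower().split())
    statements = [s.strip() for s in answer.split(".") if s.strip()]
    if not statements:
        return 0.0

    def supported(stmt: str) -> bool:
        toks = stmt.lower().split()
        return sum(t in src for t in toks) / len(toks) >= threshold

    return sum(supported(s) for s in statements) / len(statements)

score = groundedness("the sky is blue", "the sky is blue. cats are green.")
print(score)  # -> 0.5
```

Per-statement scoring is what lets the UI highlight which claims in an answer lack supporting evidence.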
Documentation
- Added Prompt/Response and Question/Statement performance to API documentation for visibility into how the feedback functions behave on real data
- Added TruBasicApp to API documentation
Examples
- Updated llama-index agent example with more evaluations
Bug Fixes
- Removed key requirements from some UI components introduced in 0.6.0
TruLens Eval v0.6.0
Changelog
- Added a feedback function (and notebook example) that checks against provided ground truths
- Added a visibility framework into feedback function operations (what data is being used by the function, etc)
- More feedback functions will add useful information in the future
- Improved QS Relevance function and added human validated quality checks which will be added to documentation in the future
- Added warning levels (yellow) for feedback functions
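The ground-truth feedback function above can be sketched as scoring each (query, response) pair against a caller-supplied answer key. The `GOLDEN` mapping and function name below are hypothetical illustrations of the pattern:

```python
# Sketch of a ground-truth feedback function: look up the expected answer
# for the query and score the response against it (names are illustrative).
GOLDEN = {"capital of france?": "paris"}

def ground_truth_agreement(query: str, response: str) -> float:
    expected = GOLDEN.get(query.lower())
    if expected is None:
        return 0.0                      # no ground truth for this query
    return 1.0 if expected in response.lower() else 0.0

print(ground_truth_agreement("Capital of France?", "It is Paris."))  # -> 1.0
```

A softer variant would return a similarity score instead of a 0/1 match, which pairs naturally with the yellow warning levels mentioned above.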
Examples
- Added example interfacing with Pinecone DB
- Added example using llama-index agents
Dependency Upgrades
- Upgraded llama-index to version 0.7+
Bug Fixes
- Add error handling around non-importable functions for deferred evaluation
TruLens Explain
Library containing attribution and interpretation methods for deep nets
TruLens Eval
Changelog
- Upgraded the Record Viewer to include Framework call stack, call timing, and merged the call parameters into the call definition dropdown.
- Added a simple CLI on installation to start the dashboard with a single trulens-eval call
- Added support for non-framework LLMs such as direct OpenAI LLM API calls or any generic "text to text" applications
- Added new examples including the evaluation of different configurations with Pinecone vector database
- Added support for Colab notebooks and added linked examples to the README
- Added ability to add tags to Apps
- Added usage documentation for Feedback Function input selection
- Bugfixes for better error handling from Feedback Functions
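The generic "text to text" support above amounts to wrapping any string-in/string-out callable so it can be recorded and evaluated like a framework app. The `BasicApp` wrapper below is a hypothetical sketch of that idea, not the library's class:

```python
# Sketch: wrap a generic text-to-text callable so its calls are recorded
# for feedback evaluation (wrapper name and fields are illustrative).
class BasicApp:
    def __init__(self, fn):
        self.fn = fn
        self.records = []

    def __call__(self, text: str) -> str:
        out = self.fn(text)
        self.records.append((text, out))
        return out

def my_llm(prompt: str) -> str:
    return prompt[::-1]              # stand-in for a direct OpenAI API call

app = BasicApp(my_llm)
print(app("abc"))                    # -> cba
```

Because the wrapper only needs a callable, the same recording and feedback machinery applies whether the app is a framework chain or a bare API call.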