Releases: truera/trulens
TruLens Eval v0.13.0
Library containing evaluations of LLM Applications
Changelog
- Updated all documentation to show context recorder usage
- Smoke tests now run against trulens-eval
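The context-recorder pattern referenced above can be sketched as a minimal Python context manager. This is illustrative only, not the trulens-eval implementation; the `Recorder` class and its fields are hypothetical stand-ins for the library's recorder objects:

```python
# Minimal sketch of a context-manager recorder: calls made through the
# recorder inside the `with` block are captured for later evaluation.
class Recorder:
    def __init__(self, app):
        self.app = app           # the wrapped LLM app (any callable)
        self.records = []        # captured input/output pairs

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False             # do not suppress exceptions

    def __call__(self, *args, **kwargs):
        out = self.app(*args, **kwargs)
        self.records.append({"input": args, "output": out})
        return out


def toy_app(prompt: str) -> str:
    return prompt.upper()        # stand-in for a real LLM app


with Recorder(toy_app) as recording:
    answer = recording("hello")

print(answer)                    # -> HELLO
print(len(recording.records))    # -> 1
```

The same shape (wrap the app, enter a `with` block, invoke, inspect records afterward) is what the updated documentation demonstrates.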
Examples
- Examples are restructured for better discoverability
- Added a Milvus Vector DB Example
Bug Fixes
- Removed metadata_fn in examples
TruLens Eval v0.12.0
Changelog
- Added chain of thought and reason metadata to LLM based feedback functions
- Feedback function docs upgrade
- Feedback Function APIs now showing actual APIs with code
- App wrappers (TruChain/TruLLama/etc) docs with code
- More concise selector documentation with code
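The chain-of-thought metadata item above can be illustrated with a small sketch: a feedback result that carries its numeric score together with the reasoning behind it. The `FeedbackResult` dataclass and the scoring heuristic here are hypothetical; a real provider would obtain both score and reasons from an LLM:

```python
from dataclasses import dataclass, field

# Sketch: a feedback result carrying the score plus reason metadata
# (field names are illustrative, not the library's schema).
@dataclass
class FeedbackResult:
    score: float
    meta: dict = field(default_factory=dict)

def relevance_with_reasons(question: str, answer: str) -> FeedbackResult:
    # Deterministic toy heuristic standing in for an LLM judgment.
    overlap = len(set(question.lower().split()) & set(answer.lower().split()))
    score = min(1.0, overlap / max(1, len(question.split())))
    return FeedbackResult(score=score, meta={"reason": f"{overlap} shared terms"})

r = relevance_with_reasons("what is trulens", "trulens evaluates LLM apps")
print(r.score, r.meta["reason"])
```

Keeping reasons alongside scores is what lets the dashboard surface *why* a feedback function scored a record the way it did.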
Examples
- Updated examples to use context recording
Bug Fixes
- Fix for basic app with multiple args
- Fix aggregation bug in multi context groundedness introduced in 0.11.0
- Timeline UI now shows the index of the JSON path when available
- No longer overwrites user changes to streamlit .toml files
- Slow or hanging thread bug fix
TruLens Eval v0.11.0
Changelog
- Added the ability to attach metadata to records
- Added feedback functions for BERTScore, ROUGE, and BLEU scores
- Added more instrumentation for Langchain Agents
- Added the ability to instrument more than the default calls, such as Langchain Prompt Templates
- Added support for tracking via python context managers
- Added badges showing test results on documentation page
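The BLEU-style feedback functions above wrap real metric implementations; as a toy illustration of the interface, here is a unigram-precision score in the spirit of BLEU-1 (the function name and heuristic are ours, not the library's):

```python
from collections import Counter

# Toy unigram precision (BLEU-1 flavor): fraction of candidate tokens
# that also appear in the reference, with clipped counts.
def unigram_precision(candidate: str, reference: str) -> float:
    cand = candidate.lower().split()
    ref = Counter(reference.lower().split())
    if not cand:
        return 0.0
    hits = sum(min(c, ref[w]) for w, c in Counter(cand).items())
    return hits / len(cand)

print(unigram_precision("the cat sat", "the cat sat on the mat"))  # -> 1.0
```

A feedback function of this shape (two strings in, a float in [0, 1] out) is what gets registered and aggregated per record.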
Examples
- Added Llama Index RAG application with a vector store using Milvus
Bug Fixes
- Fix for multi-result introduced in 0.10.0
- Allow FeedbackCall to have JSON args
- Fix error for OpenAI Chat LLM with ChatPromptTemplate
TruLens Eval v0.10.0
Changelog
- Allow connecting to remote database via SQLAlchemy
Bug Fixes
- Patch instrumentation losing track of instrumented objects (Fixes #373)
TruLens Eval v0.9.0
Changelog
- Allow custom feedback function naming
- Allow multi output from feedback functions via key, value output
- Display method name alongside class name in timeline view
- Support for calls to langchain evaluation criteria
- Allow instrumentation tracking of user defined classes for LLM apps
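The multi-output feedback item above can be sketched as a function that returns named scores as key/value pairs rather than a single float, plus a per-key aggregator. Both functions below are hypothetical illustrations of the pattern, not library code:

```python
# Sketch: a feedback function returning multiple named scores,
# aggregated per key across records.
def multi_feedback(answer: str) -> dict:
    words = answer.split()
    return {
        "length_ok": 1.0 if len(words) >= 3 else 0.0,
        "has_citation": 1.0 if "[" in answer else 0.0,
    }

def aggregate(results: list) -> dict:
    keys = results[0].keys()
    return {k: sum(r[k] for r in results) / len(results) for k in keys}

scores = aggregate([multi_feedback("short"), multi_feedback("a longer answer [1]")])
print(scores)  # -> {'length_ok': 0.5, 'has_citation': 0.5}
```

Keyed outputs let one feedback call populate several leaderboard columns at once.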
Examples
- Added Langchain Agents example
Bug Fixes
- Fix for get_records_and_feedback when supplying app_ids
- Silence key not supplied errors on all API calls
- Fix check on AzureOpenAI provider for groundedness feedback function
TruLens Eval v0.8.0
Changelog
- Support for async calls for both langchain (acall) and llama-index (aquery)
- Support for streaming and chat for llama-index (chat, achat, stream_chat, astream_chat)
- Support for user subclassed components for both langchain and llama index
- Support for specifying a db file via Tru(database_file="dbfile")
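The async support above can be sketched with plain asyncio: an app exposes an awaitable `acall` (mirroring langchain's method name from the bullet) and a recorder-style wrapper awaits it. `ToyApp` and `record_async` are illustrative names, not the library's internals:

```python
import asyncio

# Sketch of the async call path: the wrapper awaits the app's `acall`
# and captures the input/output pair as a record.
class ToyApp:
    async def acall(self, prompt: str) -> str:
        await asyncio.sleep(0)           # stand-in for network I/O
        return f"echo: {prompt}"

async def record_async(app: ToyApp, prompt: str) -> dict:
    out = await app.acall(prompt)
    return {"input": prompt, "output": out}

record = asyncio.run(record_async(ToyApp(), "hi"))
print(record["output"])                  # -> echo: hi
```

Streaming variants (stream_chat, astream_chat) follow the same shape but yield chunks instead of a single string.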
Examples
- Added a FAISS example
- Added examples for async calls
- Corrected the literature reference (Alice in Wonderland) in the llama-index subquestion example
- Add groundedness to pinecone and agents examples
- Add colab links to examples
Bug Fixes
- Fix AzureOpenAI to take deployment_id
- Fix code bugs
TruLens Eval v0.7.0
Changelog
- Added Groundedness Feedback functions to verify supporting evidence using Huggingface NLI and OpenAI LLMs
- Updated UI Timeline view to include application component input and output details on click
- Updated UI leaderboard to show application metadata
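The groundedness feedback above uses Huggingface NLI models or OpenAI LLMs to verify supporting evidence; as a toy proxy for the interface only, here is a token-overlap version (our own heuristic, not the shipped implementation):

```python
# Toy groundedness proxy: fraction of answer statements whose tokens are
# mostly covered by the source text. The real feature uses NLI / LLM checks.
def groundedness(source: str, answer: str, threshold: float = 0.5) -> float:
    src = set(source.lower().split())
    statements = [s.strip() for s in answer.split(".") if s.strip()]
    if not statements:
        return 0.0

    def supported(stmt: str) -> bool:
        toks = stmt.lower().split()
        return sum(t in src for t in toks) / len(toks) >= threshold

    return sum(supported(s) for s in statements) / len(statements)

score = groundedness("the sky is blue", "the sky is blue. cats are green.")
print(score)  # -> 0.5
```

Per-statement scoring is what lets the UI highlight which claims in an answer lack supporting evidence.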
Documentation
- Added Prompt/Response and Question/Statement performance to API documentation for visibility into how the feedback functions behave on real data
- Added TruBasicApp to API documentation
Examples
- Updated llama-index agent example with more evaluations
Bug Fixes
- Removed key requirements from some UI components introduced in 0.6.0
TruLens Eval v0.6.0
Changelog
- Added a feedback function (and notebook example) that checks against provided ground truths
- Added a visibility framework into feedback function operations (what data is being used by the function, etc)
- More feedback functions will add useful information in the future
- Improved QS Relevance function and added human validated quality checks which will be added to documentation in the future
- Added warning levels (yellow) for feedback functions
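The ground-truth feedback function above can be sketched as scoring each (query, response) pair against a caller-supplied answer key. The `GOLDEN` mapping and function name below are hypothetical illustrations of the pattern:

```python
# Sketch of a ground-truth feedback function: look up the expected answer
# for the query and score the response against it (names are illustrative).
GOLDEN = {"capital of france?": "paris"}

def ground_truth_agreement(query: str, response: str) -> float:
    expected = GOLDEN.get(query.lower())
    if expected is None:
        return 0.0                      # no ground truth for this query
    return 1.0 if expected in response.lower() else 0.0

print(ground_truth_agreement("Capital of France?", "It is Paris."))  # -> 1.0
```

A softer variant would return a similarity score instead of a 0/1 match, which pairs naturally with the yellow warning levels mentioned above.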
Examples
- Added example interfacing with Pinecone DB
- Added example using llama-index agents
Dependency Upgrades
- Upgraded llama-index to version 0.7+
Bug Fixes
- Add error handling around non-importable functions for deferred evaluation
TruLens Explain
Library containing attribution and interpretation methods for deep nets
TruLens Eval
Changelog
- Upgraded the Record Viewer to include Framework call stack, call timing, and merged the call parameters into the call definition dropdown.
- Added a simple CLI on installation to start the dashboard with a single trulens-eval call
- Added support for non-framework LLMs such as direct OpenAI LLM API calls or any generic "text to text" applications
- Added new examples including the evaluation of different configurations with Pinecone vector database
- Added support for Colab notebooks and added linked examples to the README
- Added ability to add tags to Apps
- Added usage documentation for Feedback Function input selection
- Bugfixes for better error handling from Feedback Functions
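The generic "text to text" support above amounts to wrapping any string-in/string-out callable so it can be recorded and evaluated like a framework app. The `BasicApp` wrapper below is a hypothetical sketch of that idea, not the library's class:

```python
# Sketch: wrap a generic text-to-text callable so its calls are recorded
# for feedback evaluation (wrapper name and fields are illustrative).
class BasicApp:
    def __init__(self, fn):
        self.fn = fn
        self.records = []

    def __call__(self, text: str) -> str:
        out = self.fn(text)
        self.records.append((text, out))
        return out

def my_llm(prompt: str) -> str:
    return prompt[::-1]              # stand-in for a direct OpenAI API call

app = BasicApp(my_llm)
print(app("abc"))                    # -> cba
```

Because the wrapper only needs a callable, the same recording and feedback machinery applies whether the app is a framework chain or a bare API call.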