Updated for release ver 0.2.0
adithya-aiplanet committed May 9, 2024
2 parents 6095c09 + f93bb42 commit 5c60843
Showing 34 changed files with 2,318 additions and 49 deletions.
6 changes: 1 addition & 5 deletions .github/workflows/pr_workflow.yml
```diff
@@ -18,11 +18,7 @@ jobs:
       run: |
         python -m pip install --upgrade pip
         pip install -r requirements.txt
-    - name: Lint with flake8
-      run: |
-        pip install flake8
-        flake8 --ignore=E402,F401 .
     - name: Test with pytest
       run: |
         pip install pytest
-        pytest
+        pytest
```
31 changes: 31 additions & 0 deletions cookbook/finetuning_embedding_model.py
@@ -0,0 +1,31 @@

```python
from beyondllm import source, retrieve, llms
from beyondllm.embeddings import FineTuneEmbeddings
import os

# Setting up an environment variable for the API key
os.environ['GOOGLE_API_KEY'] = "your-api-key"

# Importing and preparing the data
data = source.fit("build-career-in-ai.pdf", dtype="pdf", chunk_size=1024, chunk_overlap=0)

# List of files to train the embeddings on
list_of_files = ['build-career-in-ai.pdf']

# Initializing a Gemini LLM model
llm = llms.GeminiModel()

# Creating an instance of FineTuneEmbeddings
fine_tuned_model = FineTuneEmbeddings()

# Training the embedding model
embed_model = fine_tuned_model.train(list_of_files, "BAAI/bge-small-en-v1.5", llm, "fintune")

# Option to load an already fine-tuned model
# embed_model = fine_tuned_model.load_model("fintune")

# Creating a retriever using the fine-tuned embeddings
retriever = retrieve.auto_retriever(data, embed_model, type="normal", top_k=4)

# Retrieving information using a query
print(retriever.retrieve("How to excel in AI?"))
```
Binary file added docs/.gitbook/assets/Thumbnails.png
40 changes: 40 additions & 0 deletions docs/README.md
@@ -0,0 +1,40 @@
# 📄 Overview

We at AI Planet are excited to introduce [BeyondLLM](https://github.com/aiplanethub/beyondllm), an open-source framework designed to streamline the development of RAG and LLM applications, complete with evaluations, all in just 5-7 lines of code. 

Yes, you read that correctly. Only 5-7 lines of code. 
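
For instance, here is what an end-to-end pipeline can look like, sketched from the components documented in this release. The file path and API key are placeholders, and we assume the default Gemini LLM and embeddings kick in when none are passed explicitly:

```python
# A minimal RAG pipeline sketch using BeyondLLM's high-level components.
from beyondllm import source, retrieve, generator
import os

os.environ['GOOGLE_API_KEY'] = "your-api-key"  # placeholder key
data = source.fit("sample.pdf", dtype="pdf", chunk_size=512, chunk_overlap=100)
retriever = retrieve.auto_retriever(data, type="normal", top_k=4)
pipeline = generator.Generate(question="What is this document about?", retriever=retriever)
print(pipeline.call())
```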

Let's look at what BeyondLLM offers and why you might need it.

<figure><img src=".gitbook/assets/Thumbnails.png" alt=""><figcaption><p>Build-Experiment-Evaluate-Repeat</p></figcaption></figure>

### Why BeyondLLM?

#### Easily build RAG and Evals in 5 lines of code

* Building a robust RAG (Retrieval-Augmented Generation) system involves integrating `various components` and managing the associated `hyperparameters`. BeyondLLM offers an optimal framework for `quickly experimenting with RAG applications`.
* With components like `source` and `auto_retriever`, which support several parameters, most of the integration work is automated, eliminating the need for manual coding.
* Additionally, we are actively working on features such as hyperparameter tuning for RAG applications, the next key item on our development roadmap.

#### Customised Evaluation Support

* Most RAG evaluation tooling on the market relies on an OpenAI API key and closed-source LLMs. With BeyondLLM, you have the flexibility to select any LLM for evaluating both LLMs and embeddings.
* We support `2 evaluation metrics` for embeddings: `Hit rate` and `MRR (Mean Reciprocal Rank)`, allowing users to choose the most suitable model for their specific needs. An illustrative computation of both metrics follows this list.
* Additionally, we provide `4 evaluation metrics` for assessing `Large Language Models` across various criteria, in line with current research standards.
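
For intuition, here is a small, framework-independent sketch of how these two metrics are typically computed (illustrative only, not BeyondLLM's internal code):

```python
# Hit Rate: fraction of queries where the relevant document is retrieved at all.
# MRR: average of 1/rank of the first relevant document (0 when it is missed).
def hit_rate_and_mrr(results):
    # results: list of (ranked_ids, relevant_id) pairs, best match first
    hits, reciprocal_ranks = 0, []
    for ranked_ids, relevant_id in results:
        if relevant_id in ranked_ids:
            hits += 1
            reciprocal_ranks.append(1.0 / (ranked_ids.index(relevant_id) + 1))
        else:
            reciprocal_ranks.append(0.0)
    return hits / len(results), sum(reciprocal_ranks) / len(results)

print(hit_rate_and_mrr([(["d1", "d2"], "d2"), (["d3"], "d9")]))  # (0.5, 0.25)
```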

#### Support for various custom LLMs to fit your needs

* HuggingFace: easy access to open-source LLMs for everyone
* Ollama: run LLMs locally
* Gemini (default LLM): build multimodal applications
* OpenAI: powerful chat models with top-quality responses
* Azure: strong response quality with large 32K context support

#### Reduce LLM Hallucination

* The primary objective is to minimize or eliminate hallucinations within the RAG framework.
* To support this goal, we've developed the `Advanced RAG section`, which facilitates rapid experimentation for constructing RAG pipelines with reduced hallucination risks.
* BeyondLLM features, including `source` and `auto_retriever`, incorporate functionalities such as a `Markdown splitter`, `chunking strategies`, `Re-ranking (Cross encoders and flag embedding)` and `Hybrid Search`, enhancing the reliability of RAG applications.
* It's worth noting Andrej Karpathy's insight: "[Hallucination is a LLM's greatest feature and not a bug](http://twitter.com/karpathy/status/1733299213503787018)," underscoring the inherent capabilities of language models.

Done talking, let's build.
42 changes: 42 additions & 0 deletions docs/SUMMARY.md
@@ -0,0 +1,42 @@
# Table of contents

## Getting started

* [📄 Overview](README.md)
* [🔧 Installation](getting-started/installation.md)
* [🚀 Quickstart Guide](getting-started/quickstart-guide.md)

## Core Components

* [🌐 Source](core-components/source.md)
* [🧬 Embeddings](core-components/embeddings.md)
* [🤖 Auto Retriever](core-components/auto-retriever/README.md)
* [🔫 Evaluate retriever](core-components/auto-retriever/evaluate-retriever.md)
* [💼 Vector Store](core-components/vector-store.md)
* [🧠 LLMs](core-components/llms.md)
* [🔋 Generator](core-components/generator.md)
* [📊 Evaluation](core-components/evaluation.md)

## Advanced RAG

* [📚 Re-ranker Retrievers](advanced-rag/re-ranker-retrievers.md)
* [🔀 Hybrid Retrievers](advanced-rag/hybrid-retrievers.md)
* [🥶 Finetune Embeddings](advanced-rag/finetune-embeddings.md)

## Use Cases

* [💬 Chat with PowerPoint Presentation](use-cases/chat-with-powerpoint-presentation.md)
* [🔍 Document Search and Chat](use-cases/document-search-and-chat.md)
* [🤖 Customer Service Bot](use-cases/customer-service-bot.md)
* [🗣️ Multilingual RAG](use-cases/multilingual-rag.md)

## How to Guides

* [➕ How to add new LLM?](how-to-guides/how-to-add-new-llm.md)
* [➕ How to add new Embeddings?](how-to-guides/how-to-add-new-embeddings.md)
* [➕ How to add a new Loader?](how-to-guides/how-to-add-a-new-loader.md)

## Community Spotlight

* [🔄 Share your work](community-spotlight/share-your-work.md)
* [👏 Acknowledgements](community-spotlight/acknowledgements.md)
65 changes: 65 additions & 0 deletions docs/advanced-rag/finetune-embeddings.md
@@ -0,0 +1,65 @@
# 🥶 Finetune Embeddings

BeyondLLM lets you fine-tune embedding models on your own data to achieve more accurate results.

You can fine-tune any embedding model available on [Hugging Face](https://huggingface.co/).

### **Step 1 : Importing Modules**

You need an LLM to generate QA pairs for fine-tuning, and the `FineTuneEmbeddings` module to fine-tune the model.

```python
from beyondllm.llms import GeminiModel
from beyondllm.embeddings import FineTuneEmbeddings

# Initializing the LLM
llm = GeminiModel()

# Creating the fine-tuning engine
fine_tuned_model = FineTuneEmbeddings()
```

### **Step 2 : Data to FineTune**

You need data to fine-tune your model. It can be one or more files, so make a list of all the files you want to train your model on.

```python
list_of_files = ['your-file-here-1', 'your-file-here-2']
```

### **Step 3: Training the Model**

Once everything is ready, start training with the `train` function in `FineTuneEmbeddings`.

**Parameters:**

* **Files:** The list of files you want to train your model on.
* **Model name:** The model you want to fine-tune.
* **LLM:** The language model used to generate the dataset for fine-tuning.
* **Output path:** The path where your fine-tuned embedding model will be saved.

```python
# Training the embedding model
embed_model = fine_tuned_model.train(list_of_files, "BAAI/bge-small-en-v1.5", llm, "fintune")
```
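
Under the hood, the LLM generates question-answer pairs from your files, and those pairs serve as the training dataset for fine-tuning the embedding model.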

### **(Optional) Step 4: Loading the Model**

If you have already fine-tuned your model and want to use it again, you can load it with the `load_model` function.

**Parameters:**

* **Path:** The path where you saved the model after fine-tuning.

```python
# Option to load an already fine-tuned model
embed_model = fine_tuned_model.load_model("fintune")
```

### **Step 5: Voilà, Use Your Embedding Model**

Set up your retriever using the fine-tuned model and plug it into your use case.

```python
from beyondllm import retrieve

# `data` is your knowledge base prepared with source.fit
retriever = retrieve.auto_retriever(data, embed_model, type="normal", top_k=4)
```
103 changes: 103 additions & 0 deletions docs/advanced-rag/hybrid-retrievers.md
@@ -0,0 +1,103 @@
# 🔀 Hybrid Retrievers

## Enhancing Retrieval Accuracy

This retriever combines the strengths of vector similarity search and keyword-based search. By seamlessly blending these approaches, it retrieves documents that not only align semantically with the query but also encompass relevant keywords. The result is a more holistic and comprehensive set of results, enhancing the overall effectiveness of information retrieval.
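
To make the idea concrete, here is a small, self-contained sketch of score fusion between keyword overlap and vector similarity. It illustrates the general technique, not BeyondLLM's internal implementation; the function names and the fixed `vector_scores` are purely illustrative (real systems use BM25 and approximate nearest-neighbor search):

```python
# Toy hybrid scoring: blend a keyword-overlap score with a vector similarity score.
def keyword_score(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

def hybrid_scores(query, docs, vector_scores, alpha=0.5):
    # vector_scores: precomputed cosine similarities, one per doc
    return [alpha * vector_scores[i] + (1 - alpha) * keyword_score(query, doc)
            for i, doc in enumerate(docs)]

docs = ["hybrid search blends keywords", "vector embeddings capture meaning"]
print(hybrid_scores("hybrid keyword search", docs, vector_scores=[0.3, 0.8]))
```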

## Code Example: Hybrid Retrievers

This example demonstrates the use of a hybrid retriever, including evaluation steps for both retriever performance and LLM response quality.

### 1. Load and Process the Data

The `fit` function processes and prepares your data for indexing and retrieval. It offers a unified interface for loading and processing data regardless of the source type. Here we are using a PDF file for retrieval purposes.

```python
# fit the data from the pdf file
from beyondllm.source import fit

data = fit(path="path/to/your/pdf/file.pdf", dtype="pdf", chunk_size=512, chunk_overlap=100)
```

### 2. Load Embedding Model

The chosen embedding model generates vector representations of the text data extracted by the `fit` function. These embeddings capture the semantic meaning of the text and enable efficient similarity search during retrieval.

Here we are using `all-MiniLM-L6-v2` model from the HuggingFace hub.

```python
# Load the embedding model from Hugging Face Hub
from beyondllm.embeddings import HuggingFaceEmbeddings

embed_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
```

### 3. Initialize Retriever with Hybrid Search

The Auto Retriever in BeyondLLM simplifies information retrieval by abstracting complexity, enabling easy configuration of retrieval types and re-rankers. With a single line, it efficiently fetches relevant documents or passages based on user queries, utilizing embeddings for similarity search.

```python
# Initialize Retriever with Hybrid search
from beyondllm.retrieve import auto_retriever

retriever = auto_retriever(
    data=data,
    embed_model=embed_model,
    type="hybrid",
    top_k=5,
    mode="OR"
)
```
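
Here, `mode="OR"` merges the results of the keyword and vector searches (the union of both result sets), while `"AND"` keeps only documents surfaced by both searches.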

### 4. Load LLM for Evaluation and Generation

The LLM serves two purposes:

* **Evaluation:** It generates question-answer pairs from the knowledge base to assess the retriever's performance.
* **Generation:** It will be used later to generate responses to user queries based on the retrieved information.

Here we are using the `zephyr-7b-beta` model from the HuggingFace hub.

```python
# Load the LLM model from HuggingFace Hub
from beyondllm.llms import HuggingFaceHubModel

llm = HuggingFaceHubModel(model="HuggingFaceH4/zephyr-7b-beta", token="your_huggingfacehub_token", model_kwargs={"max_new_tokens": 512, "temperature": 0.1})
```

### 5. Evaluate Retriever Performance

The `evaluate` function measures the retriever's effectiveness using the generated question-answer pairs. It calculates the hit rate (percentage of queries where a relevant document is retrieved) and MRR (mean reciprocal rank of the first relevant document) to quantify retrieval accuracy.

```python
# Evaluate the retriever
results = retriever.evaluate(llm)

print(f"Hybrid retriever Hit Rate and MRR: {results}")
```

### 6. Generate Response and Evaluate LLM Output

This step simulates a user query and generates a response using the BeyondLLM pipeline. The `Generate` class combines the retriever and LLM to fetch relevant information and formulate an answer. Additionally, the RAG Triad evaluations assess the quality of the LLM's response.

```python
# Generate text using the LLM model
from beyondllm.generator import Generate

pipeline = Generate(question="what is the pdf mentioning about?", retriever=retriever, llm=llm)
print(pipeline.call()) # AI response

print(pipeline.get_rag_triad_evals()) # Evaluate LLM response quality
```

## Explanation of Evaluation Outputs

* **Retriever Evaluation:** The hit rate and MRR provide insights into the retriever's ability to locate relevant information.
* **RAG Triad Evaluations:**
* **Context Relevancy:** Measures how well the retrieved information relates to the user query.
* **Answer Relevancy:** Assesses the relevance of the generated response to the user query.
* **Groundedness:** Evaluates whether the generated response is supported by the retrieved information and avoids hallucination.
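
As a rough mental model of what a context-relevancy style check measures (not BeyondLLM's actual implementation), one can score query-chunk similarity with embeddings:

```python
# Rough mental model: score each retrieved chunk against the query via cosine
# similarity of sentence embeddings. Purely illustrative, standalone sketch.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
query = "what is the pdf mentioning about?"
chunks = ["The PDF covers career advice in AI.", "Unrelated boilerplate text."]
scores = util.cos_sim(model.encode(query), model.encode(chunks))
print(scores)  # higher score = chunk more relevant to the query
```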

{% hint style="info" %}
**Remember:** Experiment with different re-ranker models and retrieval parameters to optimize your BeyondLLM application for your specific use case and data characteristics.
{% endhint %}