Updated for release ver 0.2.0
adithya-aiplanet committed May 9, 2024
2 parents 6095c09 + f93bb42 commit 5c60843
Showing 34 changed files with 2,318 additions and 49 deletions.
6 changes: 1 addition & 5 deletions .github/workflows/pr_workflow.yml
```diff
@@ -18,11 +18,7 @@ jobs:
       run: |
         python -m pip install --upgrade pip
         pip install -r requirements.txt
-    - name: Lint with flake8
-      run: |
-        pip install flake8
-        flake8 --ignore=E402,F401 .
     - name: Test with pytest
       run: |
         pip install pytest
-        pytest
+        pytest
```
31 changes: 31 additions & 0 deletions cookbook/finetuning_embedding_model.py
@@ -0,0 +1,31 @@

```python
from beyondllm import source, retrieve, llms
from beyondllm.embeddings import FineTuneEmbeddings
import os

# Setting up an environment variable for the API key
os.environ['GOOGLE_API_KEY'] = "your-api-key"

# Importing and preparing the data
data = source.fit("build-career-in-ai.pdf", dtype="pdf", chunk_size=1024, chunk_overlap=0)

# List of files to train the embeddings on
list_of_files = ['build-career-in-ai.pdf']

# Initializing a Gemini LLM model
llm = llms.GeminiModel()

# Creating an instance of FineTuneEmbeddings
fine_tuned_model = FineTuneEmbeddings()

# Training the embedding model
embed_model = fine_tuned_model.train(list_of_files, "BAAI/bge-small-en-v1.5", llm, "fintune")

# Option to load an already fine-tuned model
# embed_model = fine_tuned_model.load_model("fintune")

# Creating a retriever using the fine-tuned embeddings
retriever = retrieve.auto_retriever(data, embed_model, type="normal", top_k=4)

# Retrieving information using a query
print(retriever.retrieve("How to excel in AI?"))
```
Binary file added docs/.gitbook/assets/Thumbnails.png
40 changes: 40 additions & 0 deletions docs/README.md
@@ -0,0 +1,40 @@
# 📄 Overview

We at AI Planet are excited to introduce [BeyondLLM](https://github.com/aiplanethub/beyondllm), an open-source framework designed to streamline the development of RAG and LLM applications, complete with evaluations, all in just 5-7 lines of code. 

Yes, you read that correctly. Only 5-7 lines of code. 
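
For instance, here is what an end-to-end pipeline can look like, sketched from the components documented in this release. The file path and API key are placeholders, and we assume the default Gemini LLM and embeddings kick in when none are passed explicitly:

```python
# A minimal RAG pipeline sketch using BeyondLLM's high-level components.
from beyondllm import source, retrieve, generator
import os

os.environ['GOOGLE_API_KEY'] = "your-api-key"  # placeholder key
data = source.fit("sample.pdf", dtype="pdf", chunk_size=512, chunk_overlap=100)
retriever = retrieve.auto_retriever(data, type="normal", top_k=4)
pipeline = generator.Generate(question="What is this document about?", retriever=retriever)
print(pipeline.call())
```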

Let's look at what BeyondLLM offers and why you might need it.

<figure><img src=".gitbook/assets/Thumbnails.png" alt=""><figcaption><p>Build-Experiment-Evaluate-Repeat</p></figcaption></figure>

### Why BeyondLLM?

#### Easily build RAG and Evals in 5 lines of code

* Building a robust RAG (Retrieval-Augmented Generation) system involves integrating `various components` and managing the associated `hyperparameters`. BeyondLLM offers an optimal framework for `quickly experimenting with RAG applications`.
* With components like `source` and `auto_retriever`, which support several parameters, most of the integration work is automated, eliminating the need for manual coding.
* Additionally, we are actively working on features such as hyperparameter tuning for RAG applications, the next key item on our development roadmap.

#### Customised Evaluation Support

* Most RAG evaluation tooling on the market relies on an OpenAI API key and closed-source LLMs. With BeyondLLM, you have the flexibility to select any LLM for evaluating both LLMs and embeddings.
* We support `2 evaluation metrics` for embeddings: `Hit rate` and `MRR (Mean Reciprocal Rank)`, allowing users to choose the most suitable model for their specific needs. An illustrative computation of both metrics follows this list.
* Additionally, we provide `4 evaluation metrics` for assessing `Large Language Models` across various criteria, in line with current research standards.
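
For intuition, here is a small, framework-independent sketch of how these two metrics are typically computed (illustrative only, not BeyondLLM's internal code):

```python
# Hit Rate: fraction of queries where the relevant document is retrieved at all.
# MRR: average of 1/rank of the first relevant document (0 when it is missed).
def hit_rate_and_mrr(results):
    # results: list of (ranked_ids, relevant_id) pairs, best match first
    hits, reciprocal_ranks = 0, []
    for ranked_ids, relevant_id in results:
        if relevant_id in ranked_ids:
            hits += 1
            reciprocal_ranks.append(1.0 / (ranked_ids.index(relevant_id) + 1))
        else:
            reciprocal_ranks.append(0.0)
    return hits / len(results), sum(reciprocal_ranks) / len(results)

print(hit_rate_and_mrr([(["d1", "d2"], "d2"), (["d3"], "d9")]))  # (0.5, 0.25)
```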

#### Support for various custom LLMs to fit your needs

* HuggingFace: easy access to open-source LLMs for everyone
* Ollama: run LLMs locally
* Gemini (default LLM): build multimodal applications
* OpenAI: powerful chat models with top-quality responses
* Azure: strong response quality with large 32K context support

#### Reduce LLM Hallucination

* The primary objective is to minimize or eliminate hallucinations within the RAG framework.
* To support this goal, we've developed the `Advanced RAG section`, which facilitates rapid experimentation for constructing RAG pipelines with reduced hallucination risks.
* BeyondLLM features, including `source` and `auto_retriever`, incorporate functionalities such as a `Markdown splitter`, `chunking strategies`, `Re-ranking (Cross encoders and flag embedding)` and `Hybrid Search`, enhancing the reliability of RAG applications.
* It's worth noting Andrej Karpathy's insight: "[Hallucination is a LLM's greatest feature and not a bug](http://twitter.com/karpathy/status/1733299213503787018)," underscoring the inherent capabilities of language models.

Done talking, let's build.
42 changes: 42 additions & 0 deletions docs/SUMMARY.md
@@ -0,0 +1,42 @@
# Table of contents

## Getting started

* [📄 Overview](README.md)
* [🔧 Installation](getting-started/installation.md)
* [🚀 Quickstart Guide](getting-started/quickstart-guide.md)

## Core Components

* [🌐 Source](core-components/source.md)
* [🧬 Embeddings](core-components/embeddings.md)
* [🤖 Auto Retriever](core-components/auto-retriever/README.md)
* [🔫 Evaluate retriever](core-components/auto-retriever/evaluate-retriever.md)
* [💼 Vector Store](core-components/vector-store.md)
* [🧠 LLMs](core-components/llms.md)
* [🔋 Generator](core-components/generator.md)
* [📊 Evaluation](core-components/evaluation.md)

## Advanced RAG

* [📚 Re-ranker Retrievers](advanced-rag/re-ranker-retrievers.md)
* [🔀 Hybrid Retrievers](advanced-rag/hybrid-retrievers.md)
* [🥶 Finetune Embeddings](advanced-rag/finetune-embeddings.md)

## Use Cases

* [💬 Chat with PowerPoint Presentation](use-cases/chat-with-powerpoint-presentation.md)
* [🔍 Document Search and Chat](use-cases/document-search-and-chat.md)
* [🤖 Customer Service Bot](use-cases/customer-service-bot.md)
* [🗣️ Multilingual RAG](use-cases/multilingual-rag.md)

## How to Guides

* [➕ How to add new LLM?](how-to-guides/how-to-add-new-llm.md)
* [➕ How to add new Embeddings?](how-to-guides/how-to-add-new-embeddings.md)
* [➕ How to add a new Loader?](how-to-guides/how-to-add-a-new-loader.md)

## Community Spotlight

* [🔄 Share your work](community-spotlight/share-your-work.md)
* [👏 Acknowledgements](community-spotlight/acknowledgements.md)
65 changes: 65 additions & 0 deletions docs/advanced-rag/finetune-embeddings.md
@@ -0,0 +1,65 @@
# 🥶 Finetune Embeddings

BeyondLLM lets you fine-tune embedding models on your own data to achieve more accurate results.

You can fine-tune any embedding model available on [Hugging Face](https://huggingface.co/).

### **Step 1 : Importing Modules**

You need an LLM to generate QA pairs for fine-tuning, and the `FineTuneEmbeddings` module to fine-tune the model.

```python
from beyondllm.llms import GeminiModel
from beyondllm.embeddings import FineTuneEmbeddings

# Initializing the LLM
llm = GeminiModel()

# Creating the fine-tuning engine
fine_tuned_model = FineTuneEmbeddings()
```

### **Step 2 : Data to FineTune**

You need data to fine-tune your model. It can be one or more files, so make a list of all the files you want to train your model on.

```python
list_of_files = ['your-file-here-1', 'your-file-here-2']
```

### **Step 3: Training the Model**

Once everything is ready, start training with the `train` function in `FineTuneEmbeddings`.

**Parameters:**

* **Files:** The list of files you want to train your model on.
* **Model name:** The model you want to fine-tune.
* **LLM:** The language model used to generate the dataset for fine-tuning.
* **Output path:** The path where your fine-tuned embedding model will be saved.

```python
# Training the embedding model
embed_model = fine_tuned_model.train(list_of_files, "BAAI/bge-small-en-v1.5", llm, "fintune")
```
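
Under the hood, the LLM generates question-answer pairs from your files, and those pairs serve as the training dataset for fine-tuning the embedding model.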

### **(Optional) Step 4: Loading the Model**

If you have already fine-tuned your model and want to use it again, you can load it with the `load_model` function.

**Parameters:**

* **Path:** The path where you saved the model after fine-tuning.

```python
# Option to load an already fine-tuned model
embed_model = fine_tuned_model.load_model("fintune")
```

### **Step 5: Voilà, Use Your Embedding Model**

Set up your retriever using the fine-tuned model and plug it into your use case.

```python
from beyondllm import retrieve

# `data` is your knowledge base prepared with source.fit
retriever = retrieve.auto_retriever(data, embed_model, type="normal", top_k=4)
```
103 changes: 103 additions & 0 deletions docs/advanced-rag/hybrid-retrievers.md
@@ -0,0 +1,103 @@
# 🔀 Hybrid Retrievers

## Enhancing Retrieval Accuracy

This retriever combines the strengths of vector similarity search and keyword-based search. By seamlessly blending these approaches, it retrieves documents that not only align semantically with the query but also encompass relevant keywords. The result is a more holistic and comprehensive set of results, enhancing the overall effectiveness of information retrieval.
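
To make the idea concrete, here is a small, self-contained sketch of score fusion between keyword overlap and vector similarity. It illustrates the general technique, not BeyondLLM's internal implementation; the function names and the fixed `vector_scores` are purely illustrative (real systems use BM25 and approximate nearest-neighbor search):

```python
# Toy hybrid scoring: blend a keyword-overlap score with a vector similarity score.
def keyword_score(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

def hybrid_scores(query, docs, vector_scores, alpha=0.5):
    # vector_scores: precomputed cosine similarities, one per doc
    return [alpha * vector_scores[i] + (1 - alpha) * keyword_score(query, doc)
            for i, doc in enumerate(docs)]

docs = ["hybrid search blends keywords", "vector embeddings capture meaning"]
print(hybrid_scores("hybrid keyword search", docs, vector_scores=[0.3, 0.8]))
```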

## Code Example: Hybrid Retrievers

This example demonstrates the use of a hybrid retriever, including evaluation steps for both retriever performance and LLM response quality.

### 1. Load and Process the Data

The `fit` function processes and prepares your data for indexing and retrieval. It offers a unified interface for loading and processing data regardless of the source type. Here we are using a PDF file for retrieval purposes.

```python
# fit the data from the pdf file
from beyondllm.source import fit

data = fit(path="path/to/your/pdf/file.pdf", dtype="pdf", chunk_size=512, chunk_overlap=100)
```

### 2. Load Embedding Model

The chosen embedding model generates vector representations of the text data extracted by the `fit` function. These embeddings capture the semantic meaning of the text and enable efficient similarity search during retrieval.

Here we are using `all-MiniLM-L6-v2` model from the HuggingFace hub.

```python
# Load the embedding model from Hugging Face Hub
from beyondllm.embeddings import HuggingFaceEmbeddings

embed_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
```

### 3. Initialize Retriever with Hybrid Search

The Auto Retriever in BeyondLLM simplifies information retrieval by abstracting complexity, enabling easy configuration of retrieval types and re-rankers. With a single line, it efficiently fetches relevant documents or passages based on user queries, utilizing embeddings for similarity search.

```python
# Initialize Retriever with Hybrid search
from beyondllm.retrieve import auto_retriever

retriever = auto_retriever(
    data=data,
    embed_model=embed_model,
    type="hybrid",
    top_k=5,
    mode="OR"
)
```
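
Here, `mode="OR"` merges the results of the keyword and vector searches (the union of both result sets), while `"AND"` keeps only documents surfaced by both searches.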

### 4. Load LLM for Evaluation and Generation

The LLM serves two purposes:

* **Evaluation:** It generates question-answer pairs from the knowledge base to assess the retriever's performance.
* **Generation:** It will be used later to generate responses to user queries based on the retrieved information.

Here we are using the `zephyr-7b-beta` model from the HuggingFace hub.

```python
# Load the LLM model from HuggingFace Hub
from beyondllm.llms import HuggingFaceHubModel

llm = HuggingFaceHubModel(model="HuggingFaceH4/zephyr-7b-beta", token="your_huggingfacehub_token", model_kwargs={"max_new_tokens": 512, "temperature": 0.1})
```

### 5. Evaluate Retriever Performance

The `evaluate` function measures the retriever's effectiveness using the generated question-answer pairs. It calculates the hit rate (percentage of queries where a relevant document is retrieved) and MRR (mean reciprocal rank of the first relevant document) to quantify retrieval accuracy.

```python
# Evaluate the retriever
results = retriever.evaluate(llm)

print(f"Hybrid retriever Hit Rate and MRR: {results}")
```

### 6. Generate Response and Evaluate LLM Output

This step simulates a user query and generates a response using the BeyondLLM pipeline. The `Generate` class combines the retriever and LLM to fetch relevant information and formulate an answer. Additionally, the RAG Triad evaluations assess the quality of the LLM's response.

```python
# Generate text using the LLM model
from beyondllm.generator import Generate

pipeline = Generate(question="what is the pdf mentioning about?", retriever=retriever, llm=llm)
print(pipeline.call()) # AI response

print(pipeline.get_rag_triad_evals()) # Evaluate LLM response quality
```

## Explanation of Evaluation Outputs

* **Retriever Evaluation:** The hit rate and MRR provide insights into the retriever's ability to locate relevant information.
* **RAG Triad Evaluations:**
* **Context Relevancy:** Measures how well the retrieved information relates to the user query.
* **Answer Relevancy:** Assesses the relevance of the generated response to the user query.
* **Groundedness:** Evaluates whether the generated response is supported by the retrieved information and avoids hallucination.
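
As a rough mental model of what a context-relevancy style check measures (not BeyondLLM's actual implementation), one can score query-chunk similarity with embeddings:

```python
# Rough mental model: score each retrieved chunk against the query via cosine
# similarity of sentence embeddings. Purely illustrative, standalone sketch.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
query = "what is the pdf mentioning about?"
chunks = ["The PDF covers career advice in AI.", "Unrelated boilerplate text."]
scores = util.cos_sim(model.encode(query), model.encode(chunks))
print(scores)  # higher score = chunk more relevant to the query
```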

{% hint style="info" %}
**Remember:** Experiment with different re-ranker models and retrieval parameters to optimize your BeyondLLM application for your specific use case and data characteristics.
{% endhint %}