diff --git a/Presentation RAG.pptx b/Presentation RAG.pptx new file mode 100644 index 0000000..cad15df Binary files /dev/null and b/Presentation RAG.pptx differ diff --git a/RAG langchain model report.pdf b/RAG langchain model report.pdf new file mode 100644 index 0000000..e322b60 Binary files /dev/null and b/RAG langchain model report.pdf differ diff --git a/README.md b/README.md index bd0b32c..520f0a8 100644 --- a/README.md +++ b/README.md @@ -1,115 +1,55 @@ -![logo_ironhack_blue 7](https://user-images.githubusercontent.com/23629340/40541063-a07a0a8a-601a-11e8-91b5-2f13e4e6b441.png) - -# Retrieval Augmented Generation (RAG) Challenge - -## Introduction -Retrieval Augmented Generation (RAG) is a novel approach that combines the strengths of retrieval-based and generation-based models to provide accurate and contextually relevant responses. By leveraging a vector database to retrieve relevant documents and a large language model (LLM) to generate responses, RAG can significantly enhance the capabilities of applications in various domains such as customer support, knowledge management, and content creation. - -## Project Overview - -This project is structured to provide hands-on experience in implementing a RAG system. Students will work through stages from dataset selection to connection to external artefacts (VectorDB, APIs), gaining a comprehensive understanding of RAG’s components and their integration. - -### 1. Dataset Selection - -Select a dataset suitable for your RAG application. Possible options include: -- **Learning Material**: A collection of books, slide decks on a specific topic -- **News articles**: A dataset containing articles on various topics. -- **Product Reviews**: Reviews of products along with follow-up responses. - -**Bonus:** Consider using Multimodal datasets like text+images or text+audio - -Check the end of this file for dataset examples - -### 2. 
Exploratory Data Analysis (EDA) -Perform an EDA on the chosen dataset to understand its structure, content, and the challenges it presents. Document your findings and initial thoughts on how the data can be leveraged in a RAG system. - -### 3. Embedding and Storing Chunks - -#### 3.A Embed Your Chunks of Documents -- **Objective**: Transform your chunks of documents into embeddings that can be stored in a VectorDB. -- **Suggested Tool**: [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) (for English content). - -**Bonus** Consider using the Embedding model from OpenAI, just be attentive to costs. - -#### 3.B Connection to Vector DB -- **Objective**: Connect to a vector database to store and retrieve document embeddings. -- **Suggested Tool**: [ChromaDB](https://www.trychroma.com/). -- **Steps**: - 1. Pre-process the dataset to generate embeddings for each document using a suitable model (e.g., Sentence Transformers). - 2. Store these embeddings in ChromaDB. - 3. Implement retrieval logic to fetch relevant documents based on a query. - -**Bonus:** Consider using a Cloud service to store your embeddings like Azure AI Search or Weaviate. Be attentive to potential costs. - -#### 3.C AI Frameworks -- **Consider Using**: Frameworks like [LangChain](https://python.langchain.com/docs/integrations/vectorstores/chroma) and [LlamaIndex](https://gpt-index.readthedocs.io/en/latest/examples/vector_stores/ChromaIndexDemo.html) for easier integration. - -### 4. Connecting to LLM -- **Objective**: Connect to a Large Language Model to generate responses based on retrieved documents. -- **Suggested Tool**: [OpenAI API](https://platform.openai.com/docs/api-reference/introduction). -- **Steps**: - 1. Set up access to the OpenAI API or an alternative LLM API. - 2. Develop the logic to combine retrieved documents with the query to generate a response. - 3. Implement and test the end-to-end RAG pipeline. 
- -- **Bonus**: Connect to an API through a cloud service like AzureOpenAI, AWS Bedrock, or Google Vertex AI. Please note that the setup for this will be much more complex and not all might have a free tier model. - -### 5. Evaluation -- **Objective**: Evaluate the performance of your RAG system in two ways. - 1. **Yourself**: Test the system multiple times to understand its performance and usability. - 2. **LLM as a judge (Bonus)**: Use an LLM as a judge to generate questions and evaluate your RAG's answers. -- **Steps**: - 1. Create a test set of queries and expected responses. - 2. Measure the performance of your RAG system against these queries. - 3. Analyze and document the strengths and weaknesses of your system. - -### 6. Deployment (Bonus) -- **Objective**: Deploy the RAG system as a web application or API. -- **Tools**: Use frameworks like Flask or FastAPI for the backend and Streamlit for the frontend. -- **Steps**: - 1. Develop a simple web interface to interact with your RAG system. - 2. Deploy the application on a cloud platform such as AWS, GCP, or Heroku. - -## Resources -- [ChromaDB Documentation](https://www.trychroma.com/docs) -- [OpenAI API Documentation](https://platform.openai.com/docs/api-reference/introduction) -- [Sentence Transformers](https://www.sbert.net/) -- [Flask](https://flask.palletsprojects.com/) -- [Streamlit](https://streamlit.io/) - -## Deliverables -1. **Python Code**: Provide well-documented Python code implementing the RAG system. -2. **Report**: Submit a detailed report documenting your EDA findings, connection setups, evaluation metrics, and conclusions about the system's performance. -3. **Presentation**: Prepare a short presentation covering the project, from dataset analysis to the final evaluation. Include visual aids such as charts and example responses. - -## Bonus -- **Interactive Demo**: Provide an interactive demo of your RAG system during the presentation. 
- -This project will equip you with practical skills in implementing and evaluating a Retrieval Augmented Generation system, preparing you for advanced applications in the field of natural language processing. - ---- - -# Retrieval-Augmented Generation (RAG) Demo Project Datasets - -For this demo project, students will explore the capabilities of Retrieval-Augmented Generation (RAG) systems. Below is a curated list of datasets suitable for various RAG applications, including question-answering, semantic search, and response generation. - -## Datasets - -### 1. [Common Crawl (News and Web Data)](https://github.com/commoncrawl/) - - **Description**: This dataset comprises web-scraped data from a wide array of sources. It's excellent for general knowledge retrieval tasks and question-answering. - -### 2. [Paperswithcode Text Datasets](https://paperswithcode.com/datasets?mod=texts&page=1) - - **Description**: Portal with many datasets that can be applied to RAG. - -### 3. [Biology scientific papers](https://www.researchgate.net/topic/Biological-Science/publications) -- **Description**: Download a few Biology papers to build a RAG system on Biology topics - -### 4. [Puerto Rico news articles](https://github.com/ironhack-labs/project-5-2-genai-rag/data) -- **Description**: 15 years of crawled Puerto Rico news articles about the region. - -### 5. [Financial Laws Collection](https://github.com/ironhack-labs/project-5-2-genai-rag/data) -- **Description**: Collection of 11 documents on Financial legistaltion in Europe. - ---- - -Each of these datasets provides a unique opportunity to experiment with RAG systems and explore how retrieval impacts the quality and relevance of generated responses. \ No newline at end of file +## **RAG Langchain Model with OpenAI API** + +### _Project Overview_ +This repository contains the code and documentation for a Retrieval Augmented Generation (RAG) model, developed by Dani Siaj and Carlos Rodríguez. 
This model enables users to upload a PDF document, ask questions, and receive coherent, complete, and relevant responses generated by an integrated large language model (LLM). + +The RAG model dynamically generates a prompt from the user's query, incorporating instructions, context, and restrictions to create specific, contextually aware responses. + +### _Content_ +The uploaded PDF is a 9-page document containing information on food allergies, symptoms, and management, sourced from the American College of Allergy, Asthma, and Immunology (ACAAI). This document includes only textual content; no tables or images are present. + +### _Model Architecture_ +#### Model Selection +The model architecture is centered on the OpenAIEmbeddings API as the text embedding model. Key libraries used include: + +* Langchain: For text extraction and model chaining. +* Chroma DB: To create and manage the vector store. + +#### Components +* Document Loader: PyPDFLoader handles document uploads and text parsing. +* Embeddings: OpenAIEmbeddings transforms text into vector representations. +* Text Extraction: RecursiveCharacterTextSplitter splits the text into chunks, and ChromaDB stores their vector representations. + +### _Chain Architecture_ +### * _Retrieval of Information_: +User queries retrieve a set of k documents (where k=3 in the code) from the ChromaDB vector store using similarity_search(). +### * _Prompt Engineering_: +* A context is built using the selected documents. +* This context is passed to the dynamic prompt-generating function. +* A specific, context-aware prompt is created based on the user’s query. +### * _LLM Implementation_: +The prompt is sent to the LLM via the OpenAI API to generate the desired response. 
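The retrieval and prompt-building steps described above can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions, not the project's implementation: the real pipeline uses OpenAIEmbeddings and Chroma's `similarity_search()`, whereas here the two-dimensional "embeddings", the sample chunk texts, and the helper names `cosine`, `top_k`, `build_context`, and `build_prompt` are all hypothetical and invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, store, k=3):
    """Return the texts of the k chunks most similar to the query.

    `store` is a list of (text, vector) pairs standing in for the vector store.
    """
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_context(chunks):
    """Concatenate the retrieved chunks under 'Context:' labels."""
    return "".join(f"\nContext:\n{chunk}\n\n" for chunk in chunks)


def build_prompt(question, chunks):
    """Combine the user question, the retrieved context, and one restriction."""
    return (
        f'The user asked: "{question}"\n\n'
        f"CONTEXT\n'''{build_context(chunks)}'''\n\n"
        "Only respond if the answer can be found within the context."
    )


# Hypothetical chunks with made-up 2-dimensional vectors (illustration only)
store = [
    ("Sesame is a major food allergen.", [1.0, 0.0]),
    ("Epinephrine treats anaphylaxis.", [0.0, 1.0]),
    ("Tahini contains sesame.", [0.8, 0.6]),
]

chunks = top_k([1.0, 0.0], store, k=2)
prompt = build_prompt("sesame allergies", chunks)
print(prompt)
```

With k=2, only the two sesame-related chunks reach the prompt, which is the behaviour the restriction "only respond if the answer can be found within the context" relies on.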
+### _Model Evaluation_ +A second LLM is used as a "judge" to evaluate the generated responses based on the following criteria: + + * Relevance (0-5) + * Accuracy (0-5) + * Completeness (0-5) + * Clarity (0-5) + +Through prompt engineering, a dedicated evaluation prompt is used to assess the quality of each response. + +### _Streamlit App_ +The model is deployed on Streamlit, providing a user-friendly interface. Users can input questions and receive responses formatted in Markdown, followed by the LLM-based evaluation. This design enhances user experience by providing both a direct answer and an automated quality assessment. + +### _Conclusions_ +Conclusion 1: The RAG model demonstrated high efficiency in terms of response time and relevance to user queries. +Conclusion 2: The limited size of the document restricts extensive testing. Future evaluations will include larger files for a more comprehensive assessment. + +### _Repository_ +* Data folder: where the PDF documents and the Chroma DB are stored +* main.py where all the code is organized +* Pptx presentation +* ReadMe.md +* Requirements.txt with all the necessary libraries for this project +* Streamlit_RAG.py to deploy the code on the Streamlit platform and test it in an application \ No newline at end of file diff --git a/data/References for Evaluation.csv b/data/References for Evaluation.csv new file mode 100644 index 0000000..d273b72 --- /dev/null +++ b/data/References for Evaluation.csv @@ -0,0 +1,21 @@ +Question,Answer +"What are the most common food allergens?","The most common food allergens include milk, eggs, peanuts, tree nuts, fish, shellfish, wheat, soy, and sesame." +"Can you outgrow food allergies?","Yes, children may outgrow allergies to milk, egg, soy, and wheat, but peanut, tree nut, fish, and shellfish allergies often persist." 
+"How is a food allergy diagnosed?","Diagnosis involves a medical history review, symptom documentation, skin or blood tests to check for food-specific IgE antibodies, and sometimes an oral food challenge." +"What is anaphylaxis?","Anaphylaxis is a severe, life-threatening allergic reaction that can impair breathing, cause a drop in blood pressure, and may be fatal without prompt treatment." +"How can I prevent food allergies?","Prevention strategies include delaying the introduction of solid foods to young infants and introducing peanut-containing foods to high-risk infants around 4-6 months." +"What treatments are available for food allergies?","Currently, the main treatment is avoidance of allergenic foods. There are new therapies such as Palforzia for peanut allergies and a skin patch under FDA review." +"Can food allergens remain on objects?","Yes, food allergens can remain on surfaces and may cause a skin reaction if touched, but severe reactions occur primarily from ingestion." +"Can you develop food allergies as an adult?","Yes, food allergies can develop in adulthood, most commonly to shellfish, tree nuts, peanuts, and fish." +"What symptoms indicate a food allergy?","Symptoms can range from hives, swelling, gastrointestinal distress, to more severe reactions like anaphylaxis." +"How long do food allergy symptoms take to appear?","Symptoms often appear within minutes to two hours of ingestion but can be delayed in some cases, especially in children." +"What is oral allergy syndrome?","Oral allergy syndrome is a reaction caused by cross-reactive allergens found in pollen and certain foods, leading to itchiness in the mouth or throat." +"Is gluten allergy common?","There is no actual gluten allergy; however, wheat allergy and celiac disease are related conditions. Celiac disease is serious and requires strict gluten avoidance." 
+"How can I manage food allergies?","Management involves strict avoidance of allergens, reading food labels, and using an epinephrine auto-injector for emergencies." +"How do I use an epinephrine auto-injector?","Administer the auto-injector at the first sign of a severe allergic reaction. Ensure you're familiar with the device and have easy access to it." +"How expensive is food allergy testing?","Costs for testing vary widely based on the procedure and insurance coverage. It’s typically conducted for individuals with a history of reactions." +"Are there any dietary restrictions for allergens?","Yes, individuals must avoid foods known to cause allergic reactions and may need to relay this information in dining situations." +"What are cross-reactive allergens?","Cross-reactive allergens are similar proteins that can cause a reaction in those allergic to a related food, such as tree nuts and peanuts." +"Can food allergies cause gastrointestinal issues?","Yes, food allergies can lead to gastrointestinal reactions like vomiting, diarrhea, and abdominal pain as part of the allergic response." +"What are precautionary labeling statements?","Precautionary labeling statements indicate potential allergen contamination but lack standard definitions, so their meanings can vary." +"What should I do in case of a severe allergic reaction?","Use epinephrine immediately, call emergency services, and seek medical treatment even if symptoms seem to improve." 
\ No newline at end of file diff --git a/data/allergies-doc.pdf b/data/allergies-doc.pdf new file mode 100644 index 0000000..ad345a7 Binary files /dev/null and b/data/allergies-doc.pdf differ diff --git a/data/allergies_ok.pdf b/data/allergies_ok.pdf new file mode 100644 index 0000000..8b11e39 Binary files /dev/null and b/data/allergies_ok.pdf differ diff --git a/data/chroma_db/chroma.sqlite3 b/data/chroma_db/chroma.sqlite3 new file mode 100644 index 0000000..acf6d82 Binary files /dev/null and b/data/chroma_db/chroma.sqlite3 differ diff --git a/main.ipynb b/main.ipynb new file mode 100644 index 0000000..dfc32ca --- /dev/null +++ b/main.ipynb @@ -0,0 +1,982 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Gen AI RAG Project" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Import Necessary Libraries" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd\n", + "import numpy as np\n", + "\n", + "from openai import OpenAI\n", + "\n", + "## OPEN AI EMBEDDINGS:\n", + "from langchain_openai import OpenAIEmbeddings\n", + "import os\n", + "from langchain_community.vectorstores import Chroma\n", + "from langchain_community.document_loaders import PyPDFLoader\n", + "from langchain_community.embeddings.sentence_transformer import (\n", + " SentenceTransformerEmbeddings,\n", + ")\n", + "from langchain_text_splitters import RecursiveCharacterTextSplitter, CharacterTextSplitter\n", + "#from langchain_experimental.text_splitter import SemanticChunker\n", + "\n", + "from IPython.display import display, HTML, Markdown\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Initialize Embeddings from OpenAI" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + 
"output_type": "stream", + "text": [ + "Cell finished\n" + ] + } + ], + "source": [ + "API_KEY = \"\"\n", + "\n", + "# Create the embeddings function\n", + "embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\", api_key = API_KEY)\n", + "\n", + "# create a text splitter\n", + "text_splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=50, )\n", + "print('Cell finished')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. Load Data" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "# load the document and split it into chunks\n", + "document_dir = \"./data\"\n", + "filename = \"allergies_ok.pdf\"\n", + "file_path = os.path.join(document_dir, filename)\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 2. EDA" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 3. Divide into chunks" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "pages = PyPDFLoader(file_path).load_and_split() # Split the document in pages\n", + "\n", + "docs = text_splitter.split_documents(pages) # Split the pages into chunks\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "page_content='Overview \n", + "Millions of Americans have an allergy of some kind. You probably know one of those \n", + "people or are one yourself. Almost 6% of U.S. adults and children have a food allergy. \n", + "Food allergy symptoms are most common in babies and children, but they can appear at \n", + "any age. You can even develop an allergy to foods you have eaten for years with no \n", + "problems. \n", + " \n", + "Signs of Allergies \n", + "The body’s immune system keeps you healthy by fighting off infections and other dangers \n", + "to good health. 
A food allergy reaction occurs when your immune system overreacts to a \n", + "food or a substance in a food, identifying it as a danger and triggering a protective \n", + "response.' metadata={'source': './data\\\\allergies_ok.pdf', 'page': 0}\n" + ] + } + ], + "source": [ + "print(docs[0])" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + } + ], + "source": [ + "\n", + "print(type(docs[0]))\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 4. Embeddings into Chroma DB" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Cell finished\n" + ] + } + ], + "source": [ + "# Load embeddings and save them into Chroma\n", + "db = Chroma.from_documents(docs, embeddings, persist_directory=\"./allergy_chroma_db\")\n", + "print('Cell finished')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 5. Obtain the k number of most similar results to the user's query" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "user_question = input(\"Ask a question about allergies: \")\n", + "docs = db.similarity_search(user_question, k=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 6. 
Build the prompt based on the similarity search results" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Build function to create the content for the prompt" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "def _get_document_context(docs):\n", + " context = '\\n'\n", + " for doc in docs:\n", + " context += '\\nContext:\\n'\n", + " context += doc.page_content + '\\n\\n'\n", + " return context" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Build a dynamic prompt including the context based on the results from the query" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [], + "source": [ + "def generate_prompt(user_question, docs):\n", + " \"\"\"\n", + " This function uses a template to generate a dynamic prompt that can be adapted to the user's query\n", + "\n", + " Arguments: user_question: str, docs: list of Document objects returned by the similarity search\n", + " \"\"\"\n", + " prompt = f\"\"\"\n", + " INTRODUCTION\n", + " You are a knowledgeable assistant trained to answer questions about allergies, symptoms, and management strategies. Your responses should be clear, concise, and focused on accurate information.\n", + "\n", + " The user asked: \"{user_question}\"\n", + "\n", + " CONTEXT\n", + " Technical documentation for allergies, symptoms, and management of allergen ingestion:\n", + " '''\n", + " {_get_document_context(docs)}\n", + " '''\n", + "\n", + " RESTRICTIONS\n", + " Always refer to products or allergens by their specific names as mentioned in the documentation.\n", + " Stick to facts and provide clear, evidence-based responses; avoid opinions or interpretations.\n", + " Only respond if the answer can be found within the context. If not, let the user know that the information is not available.\n", + " Do not engage in topics outside allergies, symptoms, and related health matters. 
Avoid humor, sensitive topics, and speculative discussions.\n", + " If the user’s question lacks sufficient details, request clarification rather than guessing the answer. For example, if the user does not ask anything related to allergies, allergy symptoms, or allergy management, you should request clarification.\n", + " EXAMPLE:\n", + " example 1:\n", + " User: 'I ate eggs'\n", + " Agent: 'I hope they tasted amazing. Are you allergic to eggs?'\n", + "\n", + " example 2: \n", + " User: 'I think I have an allergy to eggs'\n", + " Agent: 'Egg allergies are common and can cause a range of symptoms, from mild to more severe reactions. Here are some typical signs and management steps:\n", + " Symptoms of an Egg Allergy\n", + " Mild Reactions: Skin reactions like hives, eczema, or redness; digestive issues such as cramps, nausea, or vomiting; and runny nose or sneezing.\n", + " Severe Reactions (Anaphylaxis): Difficulty breathing, swelling of the throat, rapid pulse, dizziness, or loss of consciousness.\n", + " If you experience severe symptoms, you should seek medical help immediately, as anaphylaxis requires prompt treatment.\n", + "\n", + " Management and Avoidance Tips\n", + " Avoid Egg-Based Foods: Eggs can be hidden in foods, so check labels for ingredients like “albumin” or “lysozyme” that indicate eggs.\n", + " Consider Egg Substitutes: For baking, substitutes like applesauce, banana, or commercial egg replacers can be helpful.\n", + " Discuss with Your Doctor: They may suggest an allergy test to confirm the allergy or advise on an emergency plan, such as carrying an epinephrine auto-injector if needed.\n", + " If you’re experiencing ongoing symptoms or suspect an allergy, consulting with an allergist is recommended for personalized advice and treatment.'\n", + " TASK\n", + " Provide a direct answer based on the user’s question, if possible.\n", + " Guide the user to relevant sections of the documentation if additional context is needed.\n", + "\n", + " 
EXAMPLES:\n", + " RESPONSE STRUCTURE:\n", + " '''\n", + " # [Answer Title]\n", + " [answer text]\n", + " '''\n", + " CONVERSATION:\n", + " User: {user_question}\n", + " Agent:\n", + " \"\"\"\n", + " return prompt" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 7. Initialize OpenAI client/Assistant" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [], + "source": [ + "client = OpenAI(api_key = API_KEY)\n", + "\n", + "# Build the prompt from the user's question and the retrieved chunks\n", + "prompt = generate_prompt(user_question, docs)\n", + "messages = [{'role':'user', 'content':prompt}]\n", + "model_params = {'model': 'gpt-4o-mini', 'temperature': 0.4, 'max_tokens': 200}\n", + "completion = client.chat.completions.create(messages=messages, **model_params, timeout=120)\n", + "\n", + "# Keep the generated answer so the next cell can display it\n", + "answer = completion.choices[0].message.content\n", + "model = completion.model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "### Question: _sesame allergies_" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/markdown": [ + "'''\n", + "# Sesame Allergies\n", + "Sesame is the 9th most common food allergen and can be found in many popular dishes, including hummus (under the name \"tahini\"). According to the FDA, sesame was added as the 9th major food allergen effective January 1, 2023, under the FASTER Act of 2021. Before this date, manufacturers were not required to list it as an allergen, although it typically appears in the ingredient statement unless it is part of a natural flavoring or spice.\n", + "\n", + "If you suspect you have a sesame allergy, it is important to consult with a board-certified allergist for proper testing and management strategies. 
Symptoms can range from mild reactions to severe anaphylaxis, and having an emergency action plan is crucial for those at risk.\n", + "'''" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "query = f\"### Question: _{user_question}_\"\n", + "\n", + "from IPython.display import display, HTML, Markdown\n", + "display(Markdown(query))\n", + "display(Markdown(answer))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 8. OpenAI assistant (LLM as a judge)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Initialize OpenAI assistant" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "assistant = client.beta.assistants.create(\n", + " name=\"Food allergies expert\",\n", + " instructions=\"You are an expert in food allergies\",\n", + " model=\"gpt-4o-mini\",\n", + " tools=[{\"type\": \"file_search\"}]\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Create the vector store for the PDF we are using" + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": {}, + "outputs": [], + "source": [ + "# Create a vector store called \"allergies_document\"\n", + "vector_store = client.beta.vector_stores.create(name=\"allergies_document\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Upload the PDF file to the new vector store" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": {}, + "outputs": [], + "source": [ + "file_paths = [file_path]\n", + "file_streams = [open(path, \"rb\") for path in file_paths]\n", + "\n", + "file_batch = client.beta.vector_stores.file_batches.upload_and_poll(\n", + " vector_store_id=vector_store.id, files=file_streams\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Update the OpenAI assistant with 
the new tool (vector store)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Use the ID of the assistant created above instead of a hard-coded one\n", + "assistant = client.beta.assistants.update(\n", + " assistant_id=assistant.id,\n", + " tool_resources={\"file_search\": {\"vector_store_ids\": [vector_store.id]}},\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " Question \\\n", + "0 What are the most common food allergens? \n", + "1 Can you outgrow food allergies? \n", + "2 How is a food allergy diagnosed? \n", + "3 What is anaphylaxis? \n", + "4 How can I prevent food allergies? \n", + "\n", + " Answer \n", + "0 The most common food allergens include milk, e... \n", + "1 Yes, children may outgrow allergies to milk, e... \n", + "2 Diagnosis involves a medical history review, s... \n", + "3 Anaphylaxis is a severe, life-threatening alle... \n", + "4 Prevention strategies include delaying the int... " + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "references = pd.read_csv('./data/References for Evaluation.csv')\n", + "references.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [], + "source": [ + "references_questions = references['Question']\n", + "references_answers = references['Answer']\n" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [], + "source": [ + "references['Answer'] = [\n", + " \"The most common food allergens, often referred to as the 'Big Eight,' include milk, eggs, peanuts, tree nuts, fish, shellfish, soy, and wheat. These allergens are responsible for the majority of allergic reactions in the population. Each of these foods can provoke a range of symptoms, from mild reactions like hives to severe anaphylactic responses. It's essential for individuals with food allergies to read labels carefully and avoid these allergens to prevent adverse reactions. Awareness and education about these common allergens are crucial for managing food allergies effectively.\",\n", + " \n", + " \"Yes, some individuals can outgrow food allergies, particularly allergies to milk, eggs, soy, and wheat. Studies indicate that a significant percentage of children with these allergies may become tolerant as they age. 
However, allergies to peanuts, tree nuts, fish, and shellfish are less likely to be outgrown. Regular follow-ups with an allergist can help monitor changes in allergy status and determine if it’s safe to reintroduce certain foods into the diet.\",\n", + " \n", + " \"Food allergies are typically diagnosed through a combination of patient history, skin prick tests, and blood tests that measure specific IgE antibodies. An allergist may also recommend an oral food challenge, where the patient consumes the suspected allergen under medical supervision to observe for any reactions. Accurate diagnosis is crucial for effective management and to avoid unnecessary dietary restrictions.\",\n", + " \n", + " \"Anaphylaxis is a severe, potentially life-threatening allergic reaction that can occur within minutes of exposure to an allergen. Symptoms may include difficulty breathing, swelling of the throat, rapid heartbeat, and a drop in blood pressure. Immediate treatment with an epinephrine auto-injector is essential, as it can reverse the symptoms and save lives. Individuals at risk of anaphylaxis should carry an epinephrine auto-injector at all times and have an action plan in place.\",\n", + " \n", + " \"While not all food allergies can be prevented, certain strategies can reduce the risk. Introducing allergenic foods to infants at an early age, particularly for high-risk children, may help prevent allergies. It’s also important to avoid known allergens and educate family and caregivers about food allergies. Reading food labels carefully and communicating with restaurants about dietary restrictions can further help in preventing accidental exposure.\",\n", + " \n", + " \"Currently, the primary treatment for food allergies is strict avoidance of the allergenic food. In cases of accidental exposure, antihistamines can alleviate mild reactions, while epinephrine is necessary for severe reactions like anaphylaxis. 
Ongoing research is exploring immunotherapy options, which may help desensitize individuals to specific allergens over time, but these treatments are still under investigation.\",\n", + " \n", + " \"Yes, food allergens can remain on surfaces, utensils, and even in the air, posing a risk for cross-contamination. For example, traces of peanut butter on a knife can transfer to other foods. It’s crucial for individuals with food allergies to practice strict hygiene, including washing hands and surfaces thoroughly, to minimize the risk of accidental exposure.\",\n", + " \n", + " \"Yes, it is possible to develop food allergies as an adult, even if you have previously consumed the food without any issues. Adult-onset food allergies can be triggered by various factors, including changes in the immune system or exposure to new allergens. Symptoms may vary and can sometimes be more severe than those experienced in childhood allergies.\",\n", + " \n", + " \"Symptoms of a food allergy can vary widely and may include hives, swelling, abdominal pain, nausea, vomiting, diarrhea, and respiratory issues like wheezing or difficulty breathing. In severe cases, anaphylaxis can occur. It’s important to recognize these symptoms and seek medical attention if an allergic reaction is suspected.\",\n", + " \n", + " \"Food allergy symptoms can appear within minutes to a few hours after exposure to the allergen. In some cases, symptoms may be delayed and can take several hours to manifest, making it challenging to identify the trigger. Monitoring symptoms and keeping a food diary can help in recognizing patterns and identifying allergens.\",\n", + " \n", + " \"Oral allergy syndrome (OAS) is a condition where individuals with pollen allergies experience allergic reactions to certain raw fruits, vegetables, or nuts due to cross-reacting proteins. Symptoms typically include itching or swelling in the mouth and throat shortly after eating these foods. 
OAS is generally mild and resolves quickly, as the proteins involved are similar to those found in pollen.\",\n", + " \n", + " \"While many people mistakenly refer to gluten intolerance as a gluten allergy, the correct term is celiac disease, which is an autoimmune disorder. Celiac disease affects a small percentage of the population, but non-celiac gluten sensitivity is more common. Individuals with these conditions must adhere to a strict gluten-free diet to avoid symptoms and complications.\",\n", + " \n", + " \"Managing food allergies involves strict avoidance of the allergenic foods, educating oneself and others about the allergy, and having an emergency action plan in place. Carrying an epinephrine auto-injector is crucial for those at risk of anaphylaxis. Regular consultations with an allergist can help monitor the condition and provide guidance on managing allergies effectively.\",\n", + " \n", + " \"Using an epinephrine auto-injector is straightforward. First, remove the cap and hold the injector in your fist, with the tip pointing down. Place the tip against the outer thigh and press firmly until you hear a click. Hold it in place for about 3 seconds, then remove it and massage the injection site for 10 seconds. Seek emergency medical help immediately after using the injector, as further treatment may be necessary.\",\n", + " \n", + " \"The cost of food allergy testing can vary widely depending on the type of tests performed and the healthcare provider. Skin prick tests and blood tests can range from $100 to several hundred dollars. Insurance coverage may help offset some costs, but it’s essential to check with your provider beforehand. Regular follow-ups and consultations with an allergist can also contribute to overall costs.\",\n", + " \n", + " \"Yes, individuals with food allergies must adhere to strict dietary restrictions to avoid allergens. 
This includes reading food labels carefully, avoiding cross-contamination, and being cautious when dining out. It’s important to communicate dietary restrictions clearly to family, friends, and restaurant staff to ensure safety.\",\n", + " \n", + " \"Cross-reactive allergens occur when proteins in one substance are similar to those in another, leading to allergic reactions. For example, individuals allergic to certain pollens may also react to specific fruits and vegetables due to similar protein structures. Understanding cross-reactivity is important for managing allergies and avoiding unexpected reactions.\",\n", + " \n", + " \"Yes, food allergies can lead to gastrointestinal issues such as nausea, vomiting, abdominal pain, and diarrhea. These symptoms can occur shortly after consuming the allergenic food and may vary in severity. It’s essential to differentiate between food allergies and intolerances, as the management strategies differ.\",\n", + " \n", + " \"Precautionary labeling statements, such as 'may contain' or 'processed in a facility that handles,' are used by manufacturers to indicate potential cross-contamination with allergens. While these labels are not mandatory, they serve as a warning for individuals with food allergies. It’s crucial for consumers to take these labels seriously and avoid products that may pose a risk.\",\n", + " \n", + " \"In case of a severe allergic reaction, such as anaphylaxis, administer an epinephrine auto-injector immediately and call emergency services. Lay the person down and elevate their legs if they are feeling faint. Monitor their symptoms and be prepared to administer a second dose of epinephrine if symptoms do not improve within 5 to 15 minutes. 
Ensure that the person receives medical attention as soon as possible.\"\n", + "]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0 The most common food allergens, often referred...\n", + "1 Yes, some individuals can outgrow food allergi...\n", + "2 Food allergies are typically diagnosed through...\n", + "3 Anaphylaxis is a severe, potentially life-thre...\n", + "4 While not all food allergies can be prevented,...\n", + "5 Currently, the primary treatment for food alle...\n", + "6 Yes, food allergens can remain on surfaces, ut...\n", + "7 Yes, it is possible to develop food allergies ...\n", + "8 Symptoms of a food allergy can vary widely and...\n", + "9 Food allergy symptoms can appear within minute...\n", + "10 Oral allergy syndrome (OAS) is a condition whe...\n", + "11 While many people mistakenly refer to gluten i...\n", + "12 Managing food allergies involves strict avoida...\n", + "13 Using an epinephrine auto-injector is straight...\n", + "14 The cost of food allergy testing can vary wide...\n", + "15 Yes, individuals with food allergies must adhe...\n", + "16 Cross-reactive allergens occur when proteins i...\n", + "17 Yes, food allergies can lead to gastrointestin...\n", + "18 Precautionary labeling statements, such as 'ma...\n", + "19 In case of a severe allergic reaction, such as...\n", + "Name: Answer, dtype: object" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "new_reference_answers = references['Answer']" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[\"```\\n# Most Common Food Allergens\\nThe most common food allergens include:\\n- Milk\\n- Eggs\\n- Peanuts\\n- Tree nuts\\n- Fish\\n- Shellfish\\n- Wheat\\n- Soy\\n- Sesame (the 9th most common food allergen)\\n\\nIn adults, additional 
allergens may include fruit and vegetable pollen, which can cause oral allergy syndrome. It's important to note that individuals allergic to one type of nut may also react to related nuts, and those allergic to shrimp may have reactions to crab and lobster.\\n```\", \"'''\\n# Can You Outgrow Food Allergies?\\nYes, it is possible to outgrow food allergies. Children generally, but not always, outgrow allergies to milk, egg, soy, and wheat. Research indicates that up to 25 percent of children may outgrow their peanut allergy, with slightly fewer expected to outgrow a tree nut allergy. However, food allergies that develop in adulthood tend to be lifelong, and the chances of outgrowing them are much lower.\\n'''\", \"```\\n# Food Allergy Diagnosis\\nFood allergies are diagnosed through a combination of patient history, skin-prick tests, and blood tests that measure allergen-specific immunoglobulin E (IgE) antibodies. If these tests are inconclusive, an oral food challenge may be conducted under strict medical supervision. During this challenge, the patient is given increasing doses of the suspected allergen to observe for any reactions. This method is particularly useful when the patient's history is unclear or to determine if an allergy has been outgrown. It should only be performed by experienced allergists in a controlled setting due to the risk of severe reactions.\\n```\", \"'''\\n# What is Anaphylaxis?\\nAnaphylaxis is a potentially life-threatening allergic reaction that can occur suddenly after exposure to an allergen. It is characterized by severe symptoms, including swelling of the airways, difficulty breathing, and a sudden drop in blood pressure, which can lead to dizziness or fainting. Anaphylaxis can happen within seconds or minutes of exposure and can worsen quickly, making it critical to act promptly. The first-line treatment for anaphylaxis is epinephrine (adrenaline), which is administered using an auto-injector. 
It is essential for individuals with known allergies to carry an epinephrine auto-injector and to be educated on its use.\\n'''\", \"'''\\n# Preventing Food Allergies\\nTo potentially prevent food allergies, the American Academy of Pediatrics recommends the following strategies:\\n\\n1. **Timing of Solid Foods**: Introduce solid foods to babies no earlier than 17 weeks of age. This may help reduce the risk of developing allergies.\\n\\n2. **Breastfeeding**: Exclusively breast-feed infants for as long as possible, as this may provide some protective benefits against allergies.\\n\\n3. **Introducing Allergenic Foods**: For high-risk infants (those with a strong family history of allergic diseases), it may be beneficial to introduce peanut-containing foods as early as 4-6 months, after ensuring it is safe to do so. However, whole peanuts should never be given to infants due to choking hazards.\\n\\nIt is important to consult with a healthcare provider for personalized advice, especially for high-risk infants. \\n'''\", \"```\\n# Treatments for Food Allergies\\nThe primary way to manage a food allergy is to avoid consuming the food that causes the reaction. This involves carefully checking ingredient labels and learning about any alternative names for the allergens.\\n\\nIn addition to avoidance, other treatments may include:\\n\\n1. **Emergency Medication**: Epinephrine is the only medication that can reverse life-threatening symptoms of anaphylaxis. It is crucial for individuals with severe allergies to carry an epinephrine auto-injector.\\n\\n2. **Symptomatic Treatment**: Antihistamines and corticosteroids may be prescribed to treat symptoms of a food allergy, but they do not replace the need for epinephrine.\\n\\n3. 
**Emergency Action Plans**: For children, it's important to have a written emergency action plan in place at schools or daycare facilities, detailing how to prevent, recognize, and manage allergic reactions.\\n\\nConsulting with an allergist for personalized advice and management strategies is also recommended.\\n```\", \"'''\\n# Can Food Allergens Remain on Objects?\\nYes, food allergens can potentially remain on objects if they are not carefully cleaned. Touching an object that contains an allergen may cause a skin rash at the site of contact, but it is highly unlikely to result in a severe allergic reaction without ingestion. Washing the area with soap and water can effectively remove the allergen, while gel-based alcohol hand sanitizers will not. It is a common misconception that touching an allergen can lead to severe reactions.\\n'''\", \"'''\\n# Can Adults Develop Food Allergies?\\nYes, food allergies can develop in adults, although it is rare. Most food allergies typically develop in childhood, but some adults may experience new allergies. The most common food allergies in adults include shellfish (both crustaceans and mollusks), tree nuts, peanuts, and fish. Unlike children, adults are less likely to outgrow these allergies, and they tend to be lifelong.\\n'''\", '```\\n# Symptoms of Food Allergy\\nSymptoms of a food allergy can involve multiple systems of the body and may include:\\n\\n- Skin: Hives or skin rash\\n- Gastrointestinal Tract: Nausea, stomach cramps, vomiting, diarrhea\\n- Respiratory Tract: Stuffy or runny nose, sneezing, shortness of breath, wheezing, repetitive cough, tight or hoarse throat\\n- Cardiovascular System: Weak pulse, pale or blue coloring of the skin, dizziness or feeling faint\\n\\nIn severe cases, anaphylaxis may occur, which is a life-threatening reaction that can impair breathing and cause shock. 
Symptoms of anaphylaxis can include swelling of the tongue, trouble swallowing, and shock or circulatory collapse.\\n\\nIf you suspect a food allergy, it is important to consult an allergist for proper diagnosis and management.\\n```', '```\\n# Onset of Food Allergy Symptoms\\nMost food-related symptoms typically occur within two hours of ingestion, often starting within minutes. In rare cases, reactions may be delayed by four to six hours or longer. Delayed reactions are more commonly observed in children with eczema related to food allergies and in individuals with a rare allergy to red meat caused by the bite of a lone star tick.\\n```', \"'''\\n# Oral Allergy Syndrome\\nOral allergy syndrome, also known as pollen-food syndrome, is a condition where individuals experience allergic reactions to certain raw fruits, vegetables, and some tree nuts due to cross-reacting allergens found in pollen. Symptoms typically include an itchy mouth or tongue and swelling of the lips or tongue after consuming these foods. It is important to note that this syndrome is not a true food allergy; rather, it is a reaction to pollen proteins that are similar to those found in certain foods. The symptoms are usually short-lived, as the allergens are quickly digested, and cooking the food often eliminates the allergic response.\\n'''\", \"'''\\n# Gluten Allergy Commonality\\nThere is no such thing as a gluten allergy; this term is often confused with wheat allergy or celiac disease. While many people may label themselves as “allergic” to gluten, it is important to consult a specialist for a proper diagnosis. Celiac disease, which is a serious digestive condition triggered by gluten, is the relevant condition to consider. 
If you have concerns about gluten intolerance or celiac disease, it is advisable to see a primary care provider or a gastroenterologist.\\n'''\", '```\\n# Managing Food Allergies\\nThe primary way to manage a food allergy is to avoid consuming the food that causes you problems. Here are some key strategies:\\n\\n1. **Avoidance**: Carefully check ingredient labels of food products to ensure they do not contain allergens. Learn whether the foods you need to avoid are known by other names.\\n\\n2. **Label Awareness**: The Food Allergy Labeling and Consumer Protection Act of 2004 (FALCPA) mandates that manufacturers identify the presence of the eight most common food allergens (milk, egg, wheat, soy, peanut, tree nut, fish, and crustacean shellfish) in their products.\\n\\n3. **Emergency Action Plan**: If you or your child has a food allergy, especially to shellfish, ensure that schools or care facilities have a written emergency action plan for preventing and managing allergic reactions.\\n\\n4. **Auto-Injectors**: If prescribed, make sure you and those responsible for your care understand how to', '```\\n# Using an Epinephrine Auto-Injector\\nTo use an epinephrine auto-injector, follow these steps:\\n\\n1. **Remove the Auto-Injector**: Take the auto-injector out of its case and check the expiration date to ensure it is still valid.\\n\\n2. **Hold the Auto-Injector**: Grip the auto-injector firmly with your dominant hand, with your thumb on the bottom and your fingers around the top. Do not touch the orange tip.\\n\\n3. **Position the Injector**: Place the orange tip against the outer thigh, at a 90-degree angle to the thigh. It can be administered through clothing if necessary.\\n\\n4. **Inject**: Press down firmly until you hear a click. Hold it in place for about 3 seconds to ensure the medication is delivered.\\n\\n5. 
**Remove the Injector**: After the injection, remove the auto-injector from the thigh and massage the injection site for about 10 seconds.\\n\\n', \"'''\\n# Cost of Food Allergy Testing\\nThe cost of food allergy testing can vary widely, as there isn't a uniform price for these medical procedures. Factors such as the type of test conducted (skin prick tests or blood tests), the specific foods being tested, and insurance coverage can all influence the final cost. It's advisable to consult with an allergist for a more accurate estimate based on your specific situation and needs.\\n'''\", '```\\n# Dietary Restrictions for Allergens\\nYes, individuals with food allergies must adhere to dietary restrictions to avoid allergens. It is essential to understand how to read ingredient labels and practice avoidance measures. This includes avoiding foods that contain known allergens and being cautious of products with precautionary statements like “may contain” or “made in a shared facility.” Consulting with a board-certified allergist can provide personalized guidance on safe foods and meal planning.\\n```', \"'''\\n# Cross-Reactive Allergens\\nCross-reactive allergens are substances that can trigger allergic reactions in individuals who are already allergic to a specific food or allergen due to similarities in their protein structures. For example, a person allergic to one type of tree nut may also react to other tree nuts, and someone allergic to shrimp may experience reactions to crab and lobster. Additionally, individuals allergic to peanuts, which are legumes, may have issues with tree nuts like pecans, walnuts, and almonds. Understanding cross-reactivity is important for managing food allergies and should be discussed with a board-certified allergist.\\n'''\", '```\\n# Food Allergies and Gastrointestinal Issues\\nYes, food allergies can cause gastrointestinal issues. Symptoms may include digestive problems such as cramps, nausea, or vomiting. 
It is important to consult an allergist to determine the specific food allergy and manage symptoms effectively.\\n```', \"'''\\n# Precautionary Labeling Statements\\nPrecautionary labeling statements are indications on food packaging that suggest potential allergen contamination. Common examples include phrases such as “may contain,” “might contain,” “made on shared equipment,” or “made in a shared facility.” These statements are not legally required and lack standardized definitions, so it is essential for individuals with allergies to read labels carefully and consult with an allergist if they have questions about food safety.\\n'''\", '```\\n# Immediate Action for Severe Allergic Reaction\\nIf you experience a severe allergic reaction, administer epinephrine immediately if you have symptoms such as shortness of breath, weak pulse, generalized hives, or tightness in the throat. After using the epinephrine auto-injector, call 911 and inform the dispatcher that you used epinephrine and that more may be needed. It is important to have two doses available, as severe reactions may recur. 
Remember, if you are uncertain whether to use epinephrine, it is better to use it right away, as the benefits outweigh the risks.\\n```']\n" + ] + } + ], + "source": [ + "# choose the question\n", + "generated_answers = []\n", + "for question in references_questions:\n", + " docs = db.similarity_search(question, k=3)\n", + " prompt = generate_prompt(question, docs)\n", + " messages = [{'role':'user', 'content':prompt}]\n", + " completion = client.chat.completions.create(messages=messages, **model_params, timeout=120)\n", + " answer = completion.choices[0].message.content\n", + " generated_answers.append(answer)\n", + "\n", + "print(generated_answers)" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": {}, + "outputs": [], + "source": [ + "references['Generated Answers'] = generated_answers" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
QuestionAnswerGenerated Answers
0What are the most common food allergens?The most common food allergens include milk, e...```\\n# Most Common Food Allergens\\nThe most co...
1Can you outgrow food allergies?Yes, children may outgrow allergies to milk, e...```\\n# Can You Outgrow Food Allergies?\\nYes, i...
2How is a food allergy diagnosed?Diagnosis involves a medical history review, s...```\\n# Diagnosis of Food Allergy\\nA food aller...
3What is anaphylaxis?Anaphylaxis is a severe, life-threatening alle...```\\n# What is Anaphylaxis?\\nAnaphylaxis is a ...
4How can I prevent food allergies?Prevention strategies include delaying the int...```\\n# Preventing Food Allergies\\nPreventing f...
\n", + "
" + ], + "text/plain": [ + " Question \\\n", + "0 What are the most common food allergens? \n", + "1 Can you outgrow food allergies? \n", + "2 How is a food allergy diagnosed? \n", + "3 What is anaphylaxis? \n", + "4 How can I prevent food allergies? \n", + "\n", + " Answer \\\n", + "0 The most common food allergens include milk, e... \n", + "1 Yes, children may outgrow allergies to milk, e... \n", + "2 Diagnosis involves a medical history review, s... \n", + "3 Anaphylaxis is a severe, life-threatening alle... \n", + "4 Prevention strategies include delaying the int... \n", + "\n", + " Generated Answers \n", + "0 ```\\n# Most Common Food Allergens\\nThe most co... \n", + "1 ```\\n# Can You Outgrow Food Allergies?\\nYes, i... \n", + "2 ```\\n# Diagnosis of Food Allergy\\nA food aller... \n", + "3 ```\\n# What is Anaphylaxis?\\nAnaphylaxis is a ... \n", + "4 ```\\n# Preventing Food Allergies\\nPreventing f... " + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "references.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "8c6ce2bfa7d842d3b1bd30a3b8322083", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Downloading builder script: 0%| | 0.00/6.27k [00:00 +body { +background-image: url("https://images.unsplash.com/photo-1542281286-9e0a16bb7366"); +background-size: cover; +} + +''' + + + +st.markdown(page_bg_img, unsafe_allow_html=True) + +st.title("Food Allergies App") +st.write('This app will tell you all you need to know about food allergies!') + +st.header("Ask a question about food allergies") +#user_query = st.text_input("type your question here", "e.g.: List all the food allergies!") +with st.form('my_form'): + uploaded_file = st.file_uploader("Upload a document", type=["pdf", "docx"]) + + user_query = st.text_area('type your question here', 'e.g.: List 
all the food allergies!') + submitted = st.form_submit_button('Submit') + if submitted: + st.write('Generating response...') + ##### Check if the vector store exists ##### + db = get_or_create_vectorstore(embeddings) + + query = user_query + + ##### Get response from LLM ##### + answer = get_response(db, query) # this line generates the dynamic prompt for RAG and calls the LLM for an answer + + ##### Evaluate response ##### + st.write('Evaluating response...') + prompt_for_eval = generate_prompt_for_eval(query, answer) + evaluation = get_evaluation_from_LLM_as_a_judge(client, prompt_for_eval) + + ## 5. Display answer + dash_line = '------------------------' + st.write(dash_line) + + st.write(f"My Question: {query}") + st.write(dash_line) + st.markdown(answer) + st.write(dash_line) + st.write("Evaluation from LLM") + st.markdown(evaluation) + + + + + + + +
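
The `get_response(db, query)` call above is described only by its comment (build the RAG prompt, then call the LLM), and its body is not shown in this part of the diff. The sketch below illustrates that retrieve, prompt, generate flow under stated assumptions: it mirrors the notebook loop (`db.similarity_search(question, k=3)` followed by a prompt builder and a chat completion), but `get_response_sketch`, `FakeStore`, `demo_prompt`, and `echo_llm` are hypothetical names introduced here for illustration, and the prompt builder and LLM call are passed in as arguments so the flow can run without a vector store or an API key. It is not the app's actual implementation.

```python
from typing import Any, Callable, Sequence


def get_response_sketch(
    db: Any,
    query: str,
    build_prompt: Callable[[str, Sequence[Any]], str],
    call_llm: Callable[[str], str],
    k: int = 3,
) -> str:
    """Retrieve the k most similar chunks, build a prompt, and ask the LLM."""
    # Same retrieval call the notebook uses on its vector store.
    docs = db.similarity_search(query, k=k)
    # Hypothetical stand-in for the notebook's generate_prompt(question, docs).
    prompt = build_prompt(query, docs)
    # Hypothetical stand-in for the chat-completion call.
    return call_llm(prompt)


class FakeStore:
    """Stand-in vector store (assumption) that returns the first k chunks."""

    def __init__(self, chunks):
        self.chunks = chunks

    def similarity_search(self, query, k=3):
        return self.chunks[:k]


def demo_prompt(query, docs):
    # Join retrieved chunks into a context block ahead of the question.
    context = "\n".join(docs)
    return f"Context:\n{context}\n\nQuestion: {query}"


def echo_llm(prompt):
    # Placeholder "model" that upper-cases the prompt it receives.
    return prompt.upper()
```

In the real app, `call_llm` would presumably wrap the notebook's `client.chat.completions.create(messages=[{'role': 'user', 'content': prompt}], **model_params)` call and return `completion.choices[0].message.content`. Keeping retrieval, prompt construction, and generation as separate arguments makes each step replaceable in tests.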