From d49d348af3c62ad77435a4569a881d5f835d88bb Mon Sep 17 00:00:00 2001 From: ChengZi Date: Tue, 25 Jun 2024 15:24:08 +0800 Subject: [PATCH] add metadata filtering for lc integration Signed-off-by: ChengZi --- .../rag_with_milvus_and_langchain.ipynb | 314 +++++++++++++++--- 1 file changed, 273 insertions(+), 41 deletions(-) diff --git a/bootcamp/tutorials/integration/rag_with_milvus_and_langchain.ipynb b/bootcamp/tutorials/integration/rag_with_milvus_and_langchain.ipynb index 4f02a3ee0..53217b53a 100644 --- a/bootcamp/tutorials/integration/rag_with_milvus_and_langchain.ipynb +++ b/bootcamp/tutorials/integration/rag_with_milvus_and_langchain.ipynb @@ -2,13 +2,15 @@ "cells": [ { "cell_type": "markdown", - "source": [ - "\"Open\n" - ], "metadata": { - "collapsed": false + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } }, - "outputs": [] + "source": [ + "\"Open\n" + ] }, { "cell_type": "markdown", @@ -40,17 +42,20 @@ "metadata": {}, "outputs": [], "source": [ - "! pip install --upgrade --quiet langchain langchain-core langchain-community langchain-text-splitters langchain-milvus langchain-openai bs4" + "# ! pip install --upgrade --quiet langchain langchain-core langchain-community langchain-text-splitters langchain-milvus langchain-openai bs4" ] }, { "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "source": [ "> If you are using Google Colab, to enable dependencies just installed, you may need to **restart the runtime** (click on the \"Runtime\" menu at the top of the screen, and select \"Restart session\" from the dropdown menu)." - ], - "metadata": { - "collapsed": false - } + ] }, { "cell_type": "markdown", @@ -66,19 +71,22 @@ }, { "cell_type": "code", - "execution_count": null, - "outputs": [], - "source": [ - "import os\n", - "\n", - "os.environ[\"OPENAI_API_KEY\"] = \"sk-***********\"" - ], + "execution_count": 2, "metadata": { "collapsed": false, + "jupyter": { + "outputs_hidden": false + }, "pycharm": { "name": "#%%\n" } - } + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "os.environ[\"OPENAI_API_KEY\"] = \"sk-***********\"" + ] }, { "cell_type": "markdown", @@ -97,7 +105,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 3, "metadata": { "collapsed": false, "jupyter": { @@ -114,7 +122,7 @@ "Document(page_content='Fig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\\nAnother quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.\\nSelf-Reflection#', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'})" ] }, - "execution_count": 2, + "execution_count": 3, "metadata": {}, "output_type": "execute_result" } @@ -126,7 +134,10 @@ "\n", "# Create a WebBaseLoader instance to load documents from web sources\n", "loader = WebBaseLoader(\n", - " web_paths=(\"https://lilianweng.github.io/posts/2023-06-23-agent/\",),\n", + " web_paths=(\n", + " \"https://lilianweng.github.io/posts/2023-06-23-agent/\",\n", + " \"https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/\",\n", + " ),\n", " bs_kwargs=dict(\n", " parse_only=bs4.SoupStrainer(\n", " class_=(\"post-content\", \"post-title\", \"post-header\")\n", @@ -173,15 +184,15 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 4, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { - "name": "#%%\n", - "is_executing": true + "is_executing": true, + "name": "#%%\n" } }, "outputs": [], @@ -203,15 +214,18 @@ }, { "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "source": [ "> For the `connection_args`:\n", "> - Setting the `uri` as a local file, e.g.`./milvus.db`, is the most convenient method, as it automatically utilizes [Milvus Lite](https://milvus.io/docs/milvus_lite.md) to store all data in this file.\n", "> - If you have large scale of data, you can set up a more performant Milvus server on [docker or kubernetes](https://milvus.io/docs/quickstart.md). In this setup, please use the server uri, e.g.`http://localhost:19530`, as your `uri`.\n", "> - If you want to use [Zilliz Cloud](https://zilliz.com/cloud), the fully managed cloud service for Milvus, replace `Milvus.from_documents` with `Zilliz.from_documents`, and adjust the `uri` and `token`, which correspond to the [Public Endpoint and Api key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#free-cluster-details) in Zilliz Cloud." - ], - "metadata": { - "collapsed": false - } + ] }, { "cell_type": "markdown", @@ -222,12 +236,12 @@ } }, "source": [ - "Search the documents in the Milvus vector store using a test query question. We will get the top 3 documents." + "Search the documents in the Milvus vector store using a test query question. Let's take a look at the top 1 document." ] }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 5, "metadata": { "collapsed": false, "jupyter": { @@ -241,24 +255,22 @@ { "data": { "text/plain": [ - "[Document(page_content='Self-Reflection#\\nSelf-reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes. It plays a crucial role in real-world tasks where trial and error are inevitable.\\nReAct (Yao et al. 2023) integrates reasoning and acting within LLM by extending the action space to be a combination of task-specific discrete actions and the language space. The former enables LLM to interact with the environment (e.g. use Wikipedia search API), while the latter prompting LLM to generate reasoning traces in natural language.\\nThe ReAct prompt template incorporates explicit steps for LLM to think, roughly formatted as:\\nThought: ...\\nAction: ...\\nObservation: ...\\n... (Repeated many times)', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'pk': 449757808726900738}),\n", - " Document(page_content='Fig. 2. Examples of reasoning trajectories for knowledge-intensive tasks (e.g. HotpotQA, FEVER) and decision-making tasks (e.g. AlfWorld Env, WebShop). (Image source: Yao et al. 2023).\\nIn both experiments on knowledge-intensive tasks and decision-making tasks, ReAct works better than the Act-only baseline where Thought: … step is removed.\\nReflexion (Shinn & Labash 2023) is a framework to equips agents with dynamic memory and self-reflection capabilities to improve reasoning skills. Reflexion has a standard RL setup, in which the reward model provides a simple binary reward and the action space follows the setup in ReAct where the task-specific action space is augmented with language to enable complex reasoning steps. After each action $a_t$, the agent computes a heuristic $h_t$ and optionally may decide to reset the environment to start a new trial depending on the self-reflection results.\\n\\nFig. 3. Illustration of the Reflexion framework. (Image source: Shinn & Labash, 2023)\\nThe heuristic function determines when the trajectory is inefficient or contains hallucination and should be stopped. Inefficient planning refers to trajectories that take too long without success. Hallucination is defined as encountering a sequence of consecutive identical actions that lead to the same observation in the environment.\\nSelf-reflection is created by showing two-shot examples to LLM and each example is a pair of (failed trajectory, ideal reflection for guiding future changes in the plan). Then reflections are added into the agent’s working memory, up to three, to be used as context for querying LLM.', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'pk': 449757808726900739}),\n", - " Document(page_content='Fig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\\nAnother quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.\\nSelf-Reflection#', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'pk': 449757808726900737})]" + "[Document(page_content='Self-Reflection#\\nSelf-reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes. It plays a crucial role in real-world tasks where trial and error are inevitable.\\nReAct (Yao et al. 2023) integrates reasoning and acting within LLM by extending the action space to be a combination of task-specific discrete actions and the language space. The former enables LLM to interact with the environment (e.g. use Wikipedia search API), while the latter prompting LLM to generate reasoning traces in natural language.\\nThe ReAct prompt template incorporates explicit steps for LLM to think, roughly formatted as:\\nThought: ...\\nAction: ...\\nObservation: ...\\n... (Repeated many times)', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'pk': 449281835035555859})]" ] }, - "execution_count": 4, + "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"What is self-reflection of an AI Agent?\"\n", - "vectorstore.similarity_search(query, k=3)" + "vectorstore.similarity_search(query, k=1)" ] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 6, "metadata": { "collapsed": false, "jupyter": { @@ -322,7 +334,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 7, "metadata": { "collapsed": false, "jupyter": { @@ -336,10 +348,10 @@ { "data": { "text/plain": [ - "\"Self-reflection of an AI agent involves the process of synthesizing memories into higher-level inferences over time to guide the agent's future behavior. It includes recency, importance, and relevance factors to determine the salient high-level questions and optimize believability. This mechanism helps the agent improve iteratively by refining past action decisions and correcting previous mistakes.\"" + "\"Self-reflection of an AI agent involves the process of synthesizing memories into higher-level inferences over time to guide the agent's future behavior. It serves as a mechanism to create higher-level summaries of past events. One approach to self-reflection involves prompting the language model with the 100 most recent observations and asking it to generate the 3 most salient high-level questions based on those observations. This process helps the AI agent optimize believability in the current moment and over time.\"" ] }, - "execution_count": 6, + "execution_count": 7, "metadata": {}, "output_type": "execute_result" } @@ -362,15 +374,235 @@ }, { "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "source": [ - "This tutorial build a vanilla RAG chain powered by Milvus. For more advanced RAG techniques, please refer to the [advanced rag bootcamp](https://github.com/milvus-io/bootcamp/tree/master/bootcamp/RAG/advanced_rag)." + "Congratulations! You have built a basic RAG chain powered by Milvus and LangChain." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## Metadata filtering\n", + "\n", + "We can use the [Milvus Scalar Filtering Rules](https://milvus.io/docs/boolean.md) to filter the documents based on metadata. We have loaded the documents from two different sources, and we can filter the documents by the metadata `source`." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[Document(page_content='Fig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\\nAnother quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.\\nSelf-Reflection#', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'pk': 449281835035555858})]" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } ], + "source": [ + "vectorstore.similarity_search(\n", + " \"What is CoT?\",\n", + " k=1,\n", + " expr=\"source == 'https://lilianweng.github.io/posts/2023-06-23-agent/'\",\n", + ")\n", + "\n", + "# The same as:\n", + "# vectorstore.as_retriever(search_kwargs=dict(\n", + "# k=1,\n", + "# expr=\"source == 'https://lilianweng.github.io/posts/2023-06-23-agent/'\",\n", + "# )).invoke(\"What is CoT?\")" + ] + }, + { + "cell_type": "markdown", "metadata": { "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "If we want to dynamically change the search parameters without rebuilding the chain, we can [configure the runtime chain internals](https://python.langchain.com/v0.2/docs/how_to/configure/) . Let's define a new retriever with this dynamically configure and use it to build a new RAG chain." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[Document(page_content='Self-Reflection#\\nSelf-reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes. It plays a crucial role in real-world tasks where trial and error are inevitable.\\nReAct (Yao et al. 2023) integrates reasoning and acting within LLM by extending the action space to be a combination of task-specific discrete actions and the language space. The former enables LLM to interact with the environment (e.g. use Wikipedia search API), while the latter prompting LLM to generate reasoning traces in natural language.\\nThe ReAct prompt template incorporates explicit steps for LLM to think, roughly formatted as:\\nThought: ...\\nAction: ...\\nObservation: ...\\n... (Repeated many times)', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'pk': 449281835035555859})]" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from langchain_core.runnables import ConfigurableField\n", + "\n", + "# Define a new retriever with a configurable field for search_kwargs\n", + "retriever2 = vectorstore.as_retriever().configurable_fields(\n", + " search_kwargs=ConfigurableField(\n", + " id=\"retriever_search_kwargs\",\n", + " )\n", + ")\n", + "\n", + "# Invoke the retriever with a specific search_kwargs which filter the documents by source\n", + "retriever2.with_config(\n", + " configurable={\n", + " \"retriever_search_kwargs\": dict(\n", + " expr=\"source == 'https://lilianweng.github.io/posts/2023-06-23-agent/'\",\n", + " k=1,\n", + " )\n", + " }\n", + ").invoke(query)" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + }, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "# Define a new RAG chain with this dynamically configurable retriever\n", + "rag_chain2 = (\n", + " {\"context\": retriever2 | format_docs, \"question\": RunnablePassthrough()}\n", + " | prompt\n", + " | llm\n", + " | StrOutputParser()\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "Let's try this dynamically configurable RAG chain with different filter conditions." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "\"Self-reflection of an AI agent involves the process of synthesizing memories into higher-level inferences over time to guide the agent's future behavior. It serves as a mechanism to create higher-level summaries of past events. One approach to self-reflection involves prompting the language model with the 100 most recent observations and asking it to generate the 3 most salient high-level questions based on those observations. This process helps the AI agent optimize believability in the current moment and over time.\"" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Invoke this RAG chain with a specific question and config\n", + "rag_chain2.with_config(\n", + " configurable={\n", + " \"retriever_search_kwargs\": dict(\n", + " expr=\"source == 'https://lilianweng.github.io/posts/2023-06-23-agent/'\",\n", + " )\n", + " }\n", + ").invoke(query)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "When we change the search condition to filter the documents by the second source, as the content of this blog source has nothing todo with the query question, we get an answer with no relevant information." + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "\"I'm sorry, but based on the provided context, there is no specific information or statistical data available regarding the self-reflection of an AI agent.\"" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "rag_chain2.with_config(\n", + " configurable={\n", + " \"retriever_search_kwargs\": dict(\n", + " expr=\"source == 'https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/'\",\n", + " )\n", + " }\n", + ").invoke(query)" + ] + }, + { + "cell_type": "markdown", + "metadata": { "pycharm": { "name": "#%% md\n" } - } + }, + "source": [ + "----\n", + "This tutorial focus the basic usage of Milvus LangChain integration and simple RAG approach. For more advanced RAG techniques, please refer to the [advanced rag bootcamp](https://github.com/milvus-io/bootcamp/tree/master/bootcamp/RAG/advanced_rag)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + }, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [] } ], "metadata": {