diff --git a/notebooks/generative-ai-with-vertex/meta.toml b/notebooks/generative-ai-with-vertex/meta.toml new file mode 100644 index 00000000..9703717f --- /dev/null +++ b/notebooks/generative-ai-with-vertex/meta.toml @@ -0,0 +1,9 @@ +[meta] +title="Building a Generative AI Application with Vertex AI and SingleStoreDB" +description="""\ + Learn to build an AI application using Google Cloud's Vertex AI + and SingleStoreDB. + """ +icon="crystal-ball" +tags=["ai"] +destinations=["spaces"] diff --git a/notebooks/generative-ai-with-vertex/notebook.ipynb b/notebooks/generative-ai-with-vertex/notebook.ipynb new file mode 100644 index 00000000..460312d6 --- /dev/null +++ b/notebooks/generative-ai-with-vertex/notebook.ipynb @@ -0,0 +1,240 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "53da8269-44c3-4299-9a8e-f911beb5661e", + "metadata": {}, + "source": [ + "
\n", + "
\n", + " \n", + "
\n", + "
\n", + "
SingleStore Notebooks
\n", + "

Building a Generative AI Application with Vertex AI and SingleStoreDB

\n", + "
\n", + "
" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Document Ingestion\n", + "\n", + "Welcome to this guide on building a state-of-the-art generative AI application using Google Cloud's Vertex AI and SingleStoreDB. This guide aims to provide a seamless experience, offering step-by-step instructions, code explanations, and best practices.\n", + "\n", + "## Overview\n", + "\n", + "Vertex AI, a product by Google Cloud, offers an integrated suite of machine learning tools that allows developers to build, deploy, and scale AI models faster than ever. On the other hand, SingleStoreDB offers a fast, scalable, and SQL-compliant relational database system. By combining the power of Vertex AI's machine learning capabilities with the efficient storage and retrieval mechanisms of SingleStoreDB, we can create robust AI applications that respond to user queries in real time.\n", + "\n", + "### What You'll Learn\n", + "\n", + "- Setting up your environment with the necessary packages and credentials.\n", + "- Fetching and processing data to be used in our AI models.\n", + "- Storing and managing data efficiently using SingleStoreDB.\n", + "- Leveraging the power of Vertex AI for real-time data processing and insights.\n", + "- Building a retrieval-based QA system to answer user queries.\n", + "\n", + "### Prerequisites\n", + "\n", + "- Basic knowledge of Python programming.\n", + "- Familiarity with Google Cloud services and SQL databases.\n", + "- An active Google Cloud account.\n", + "- A SingleStoreDB hosted or self-managed instance.\n", + "\n", + "**Let's dive in and start building!**" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "%pip install --quiet google-cloud-aiplatform langchain github-clone\n", + "%pip install --quiet unstructured unstructured[pdf] pytesseract\n", + "%pip install --quiet singlestoredb" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Authentication\n", + "\n", + "The next step involves authenticating our session with Google Cloud. By running the following cell, you'll be prompted to log in using your Google Cloud credentials. Follow the instructions to complete the login process."
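, + "\n", + "The next cell relies on the Colab auth helper (`google.colab.auth`). If you are running this notebook somewhere that helper is not available, you can usually authenticate with the gcloud CLI instead, for example by running `gcloud auth application-default login` beforehand; treat that command as a general suggestion for other environments rather than a step this notebook performs."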
+ ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "from google.colab import auth as google_auth\n", + "\n", + "google_auth.authenticate_user()" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Import modules" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "# Vertex AI\n", + "import vertexai\n", + "from google.cloud import aiplatform\n", + "from vertexai.language_models import TextEmbeddingModel, TextGenerationModel\n", + "\n", + "# Langchain\n", + "from langchain.llms import VertexAI\n", + "from langchain.vectorstores import SingleStoreDB" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Obtaining a dataset\n", + "\n", + "The following dataset is composed of public data provided by the IRS regarding the 2023 tax season.\n", + "\n", + "You can download the dataset to your computer and explore it by following [this link](https://drive.google.com/file/d/1mdDHBnSWwDbMo2xyRk9gxUAswhyb9uKw/view?usp=drive_link).\n", + "\n", + "After the dataset is downloaded, the contents will be ingested into SingleStore.\n", + "\n", + "Document processing includes chunking the documents with Langchain's text-splitting utilities and generating embeddings with the Google textembedding-gecko@003 model." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "FILE_URL = \"https://github.com/datagabe/hollywood/raw/main/sample_tax_information.zip\"\n", + "\n", + "!wget {FILE_URL} -O dataset.zip\n", + "!mkdir dataset\n", + "!unzip dataset.zip -d dataset" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Loading Data from a Directory\n", + "\n", + "Once the dataset has been downloaded and unzipped, you will use Langchain's DirectoryLoader to load the documents before splitting and ingesting them into your SingleStoreDB database." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "import unstructured\n", + "from langchain.document_loaders import DirectoryLoader\n", + "\n", + "loader = DirectoryLoader('dataset')\n", + "\n", + "docs = loader.load()" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Splitting the Data\n", + "\n", + "To process the data more efficiently, we'll split the loaded content into smaller chunks. The RecursiveCharacterTextSplitter class helps in achieving this by dividing the data based on specified character limits." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", + "\n", + "text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=50)\n", + "all_splits = text_splitter.split_documents(docs)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setting Up SingleStoreDB with Vertex AI Embeddings\n", + "\n", + "For efficient storage and retrieval of our data, we use SingleStoreDB in conjunction with Vertex AI embeddings. 
The following cell initializes the Vertex AI platform, generates embeddings for the document chunks, and ingests them into SingleStoreDB; the database connection details are taken from the notebook environment, since no explicit connection string is passed here." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "from langchain.embeddings import VertexAIEmbeddings\n", + "\n", + "# Init Vertex AI Platform (set project to your GCP project ID)\n", + "aiplatform.init(project=\"\", location=\"us-central1\")\n", + "\n", + "# Generate embeddings and ingest documents\n", + "vectorstore = SingleStoreDB.from_documents(documents=all_splits, embedding=VertexAIEmbeddings(model_name=\"textembedding-gecko@003\"))" + ] + }, + { + "cell_type": "markdown", + "id": "2061618b-db57-4f41-a856-2d7ce69f5025", + "metadata": {}, + "source": [ + "
\n", + "
" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.6" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/notebooks/rag-example/meta.toml b/notebooks/rag-example/meta.toml new file mode 100644 index 00000000..d0b2e319 --- /dev/null +++ b/notebooks/rag-example/meta.toml @@ -0,0 +1,9 @@ +[meta] +title="Using RAG with SingleStoreDB" +description="""\ + Leverage the RAG pattern in the context of the generative AI + lifecycle patterns. + """ +icon="crystal-ball" +tags=["ai"] +destinations=["spaces"] diff --git a/notebooks/rag-example/notebook.ipynb b/notebooks/rag-example/notebook.ipynb new file mode 100644 index 00000000..38fdc597 --- /dev/null +++ b/notebooks/rag-example/notebook.ipynb @@ -0,0 +1,443 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "aec0e83d-36ee-4347-9e86-61642d606aaf", + "metadata": {}, + "source": [ + "
\n", + "
\n", + " \n", + "
\n", + "
\n", + "
SingleStore Notebooks
\n", + "

Using RAG with SingleStoreDB

\n", + "
\n", + "
" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Vertex AI, a product by Google Cloud, offers an integrated suite of machine learning tools that allows developers to build, deploy, and scale AI models faster than ever. On the other hand, SingleStoreDB offers a fast, scalable, and SQL-compliant relational database system. By combining the power of Vertex AI's machine learning capabilities with the efficient storage and retrieval mechanisms of SingleStoreDB, we can create robust AI applications that respond to user queries in real-time.\n", + "\n", + "## RAG with Google Gemini Pro and SingleStore\n", + "\n", + "This example leverages the RAG Pattern in the context of the Generative AI Lifecycle Patterns depicted in this [blogpost by Dr. Ali Arsanjani](https://dr-arsanjani.medium.com/the-generative-ai-lifecycle-1b0c7d9463ec).\n", + "\n", + "## What You'll Learn\n", + "\n", + "* Setting up your environment with the necessary packages and credentials.\n", + "* How Vector Similarity Search can be achieved by leveraging a SingleStore database.\n", + "* How to implement the RAG Technique.\n", + "* How to work with results from the TextGeneration API from Google Vertex AI.\n", + "\n", + "## Prerequisites\n", + "* Basic knowledge of Python programming.\n", + "* Familiarity with Google Cloud services and SQL databases.\n", + "* An active Google Cloud account.\n", + "* A SingleStoreDB hosted or self-managed instance." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "%pip install --quiet google-cloud-aiplatform singlestoredb" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "from google.colab import auth as google_auth\n", + "google_auth.authenticate_user()\n", + "\n", + "import vertexai\n", + "from google.cloud import aiplatform\n", + "from vertexai.language_models import TextEmbeddingModel\n", + "from vertexai.preview import generative_models\n", + "\n", + "import singlestoredb as s2\n", + "import json\n", + "\n", + "from IPython.display import display, Markdown, Latex" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Parameters" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "# GCP Parameters\n", + "PROJECT = \"\"\n", + "LOCATION = \"us-central1\"\n", + "\n", + "#LLM\n", + "MODEL = \"gemini-pro\"\n", + "TEMPERATURE = 0.1\n", + "TOP_K = 1\n", + "TOP_P = 1\n", + "MAX_OUTPUT_TOKENS = 2048\n", + "\n", + "# Init AI Platform\n", + "aiplatform.init(project=PROJECT, location=LOCATION)\n", + "model = generative_models.GenerativeModel(MODEL)\n", + "\n", + "# Doc Similarity Threshold\n", + "threshold = 0.7" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Connect to SingleStoreDB" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "connection = s2.connect()" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Function definitions\n", + "\n", + "### About RAG\n", + "\n", + "Access similar documents using semantic search. How is this done? A set of documents you supply are chunked (read \u2018split\u2019) up (sentence by sentence or by paragraph, or by page, etc.) 
and each chunk is then converted into an embedding with an embedding model such as textembedding-gecko@latest and stored in a vector database; Google\u2019s Vertex AI Vector Search is one example, and this notebook uses SingleStoreDB. Retrieval is done via an Approximate Nearest Neighbor (ANN) search, also known as semantic search. Supplying this retrieved context may significantly decrease the possibility of the model hallucinating and gives the model enough relevant information to return more sensible and relevant completions. This process is known as Retrieval Augmented Generation, or RAG.\n", + "\n", + "### RAG Steps\n", + "\n", + "1. Creating an **initial prompt** from the user\u2019s query or statement.\n", + "2. Augmenting the prompt with **context** retrieved from the Vector Store.\n", + "3. **Sending** the augmented prompt to the LLM.\n", + "\n", + "### RAG Implementation\n", + "\n", + "In this example, the RAG technique is implemented through the ask_question function (defined in the code cell below).\n", + "\n", + "```python\n", + "def ask_question(query, model):\n", + " # Vector Similarity Search\n", + " results = query_s2(query)\n", + " filtered_results = filter_threshold(results, threshold)\n", + " # Check if there are documents within the threshold\n", + " if len(filtered_results) == 0:\n", + " return \"I'm sorry, I don't know that.\"\n", + " unique_results = filter_unique_docs(filtered_results)\n", + " # Context Preparation\n", + " context = get_context(unique_results)\n", + " # LLM Query\n", + " answer = process_llm(query, context, model)\n", + " return context, answer\n", + "```\n", + "\n", + "#### Retrieval\n", + "\n", + "In the first step, a vector similarity search is performed against a SingleStore database where the embeddings are stored.\n", + "\n", + "The database structure is the following:\n", + "\n", + "**Table name**: embeddings\n", + "\n", + "**Columns**:\n", + " * *content*: Contains the text extracted from the document chunk during ingestion.\n", + " * *vector*: The embedding generated from the content.\n", + " * *metadata*: Contains information about the chunk: the page and the document name.\n", + "\n", + "##### Results Filtering\n", + "\n", + "In the parameters section of this notebook you will find a threshold parameter. It defines the minimum similarity score between the query and a document for that document to be included in the context; the higher the score, the more relevant the document.\n", + "\n", + "Additionally, this example filters the results to ensure there are no duplicate document/page combinations.\n", + "\n", + "#### Prompt creation\n", + "\n", + "In this case the prompt template is predefined, and the content of the documents obtained from the SingleStore database is injected into the prompt.\n", + "\n", + "**Initial Prompt**\n", + "\n", + "```\n", + "SYSTEM: You are an intelligent assistant helping the users with their questions.\n", + "\n", + "Strictly Use ONLY the following pieces of context to answer the question at the end. 
Think step-by-step and then answer.\n", + "\n", + "Do not try to make up an answer:\n", + " - If the answer to the question cannot be determined from the context alone, say \"I cannot determine the answer to that.\"\n", + " - If the context is empty, just say \"I do not know the answer to that.\"\n", + "\n", + "=============\n", + "{context}\n", + "=============\n", + "\n", + "Question: {question}\n", + "Helpful Answer:\n", + "```\n", + "\n", + "Once a result is obtained from the LLM, it can be processed through a second prompt to obtain a verification, where the LLM assesses the answer and assigns it a score. This step is optional and is not wired into the ask_question flow below, but it is a good idea to consider result verification.\n", + "\n", + "**Verification Prompt**\n", + "\n", + "```\n", + "Is the following Answer a good answer to the following question? Return the answer as a value from 0 to 5, where 0 is not a good answer and 5 is a good answer.\n", + "Provide an explanation of why you used that score.\n", + "\n", + "QUESTION: {question}\n", + "\n", + "ANSWER: {answer}\n", + "\n", + "Answer:\n", + "```\n", + "\n", + "#### Result presentation\n", + "\n", + "Finally, results are presented through the format_answer function, which uses Markdown to present the answer along with the retrieved context." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "def text_embedding(query):\n", + " \"\"\"Text embedding with a Vertex AI embedding model.\"\"\"\n", + " model = TextEmbeddingModel.from_pretrained(\"textembedding-gecko@003\")\n", + " # Embed a single query string\n", + " embeddings = model.get_embeddings([query])\n", + " return embeddings[0].values\n", + "\n", + "\n", + "def query_s2(query):\n", + " query_embeddings = json.dumps(text_embedding(query))\n", + " num_rows = 4\n", + " statement = f\"\"\"\n", + " SELECT\n", + " content, metadata,\n", + " DOT_PRODUCT(JSON_ARRAY_PACK('{query_embeddings}'), vector) AS score\n", + " FROM embeddings\n", + " ORDER BY score DESC LIMIT {num_rows}\n", + " \"\"\"\n", + "\n", + " # Execute the SQL statement\n", + " cursor = connection.cursor()\n", + " cursor.execute(statement)\n", + " try:\n", + " results = cursor.fetchall()\n", + " return results\n", + " except Exception as e:\n", + " print(f\"Error fetching results: {e}\")\n", + " return []\n", + "\n", + "\n", + "def filter_threshold(results, threshold):\n", + " filtered_results = []\n", + " for doc in results:\n", + " if doc[2] > threshold:\n", + " filtered_results.append(doc)\n", + " return filtered_results\n", + "\n", + "\n", + "def filter_unique_docs(filtered_results):\n", + " unique_docs = []\n", + " last_doc_name = \"\"\n", + " for result in filtered_results:\n", + " doc_name = result[1]['source']\n", + " if doc_name != last_doc_name:\n", + " unique_docs.append(result)\n", + " last_doc_name = doc_name\n", + " return unique_docs\n", + "\n", + "\n", + "def get_context(unique_results):\n", + " context = \"\"\n", + " for doc in unique_results:\n", + " text = doc[0]\n", + " doc_metadata = doc[1]\n", + " context = context+\"\\n----\"\n", + " context = context+f\"\\nThis information is contained in the document {doc_metadata['source']}\"\n", + " context = context+\"\\n--\"\n", + " context = context+\"\\n\"+text\n", + " return context\n", + "\n", + "\n", + "def process_llm(query, context, model):\n", + " responses = model.generate_content(\n", + " [template.format(question=query, context=context)],\n", + " generation_config={\n", + " \"max_output_tokens\": MAX_OUTPUT_TOKENS,\n", + " \"temperature\": TEMPERATURE,\n", +
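" # top_p and top_k control nucleus and top-k sampling; these values come from the Parameters cell above\n", +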
" \"top_p\": TOP_P,\n", + " \"top_k\": TOP_K,\n", + " },\n", + " stream=False)\n", + " return responses.candidates[0].content.parts[0]\n", + "\n", + "\n", + "def ask_question(query, model):\n", + " # Vector Similarity Search\n", + " results = query_s2(query)\n", + " filtered_results = filter_threshold(results, threshold)\n", + " # Check if there are documents within the threshold\n", + " if len(filtered_results) == 0:\n", + " return \"I'm sorry, I don't know that.\"\n", + " unique_results = filter_unique_docs(filtered_results)\n", + " # Context Preparation\n", + " context = get_context(unique_results)\n", + " # LLM Query\n", + " answer = process_llm(query, context, model)\n", + " return context, answer\n", + "\n", + "# LLM prompt template\n", + "template = \"\"\"SYSTEM: You are an intelligent assistant helping the users with their questions.\n", + "\n", + "Strictly Use ONLY the following pieces of context to answer the question at the end. Think step-by-step and then answer.\n", + "\n", + "Do not try to make up an answer:\n", + " - If the answer to the question cannot be determined from the context alone, say \"I cannot determine the answer to that.\"\n", + " - If the context is empty, just say \"I do not know the answer to that.\"\n", + "\n", + "=============\n", + "{context}\n", + "=============\n", + "\n", + "Question: {question}\n", + "Helpful Answer:\n", + "\"\"\"\n", + "\n", + "# Verification prompt template (optional; not called by ask_question in this example)\n", + "verification_template = \"\"\"\n", + "Is the following Answer a good answer to the following question? Return the answer as a value from 0 to 5, where 0 is not a good answer and 5 is a good answer.\n", + "Provide an explanation of why you used that score.\n", + "\n", + "QUESTION: {question}\n", + "\n", + "ANSWER: {answer}\n", + "\n", + "Answer:\n", + "\"\"\"\n", + "\n", + "\n", + "def format_answer(answer):\n", + " if isinstance(answer, tuple):\n", + " answer_sanitized = str(answer[1].text).replace(\"$\", \"\\\\$\").replace(\"#\", \"\\\\#\")\n", + " context_sanitized = answer[0].replace(\"$\", \"\\\\$\").replace(\"#\", \"\\\\#\")\n", + " display(Markdown(f\"### Answer\\n {answer_sanitized}\\n\\n
Context{context_sanitized}
\"))\n", + " else:\n", + " display(Markdown(f\"### Answer\\n {answer}\"))" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Examples" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "query = \"What is a form W-2 for?\"\n", + "format_answer(ask_question(query, model))" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "query = \"What is the fastest way to get a tax refund?\" #@param {type:\"string\"}\n", + "format_answer(ask_question(query, model))" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "query = \"what is a form 8922?\" #@param {type:\"string\"}\n", + "format_answer(ask_question(query, model))" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "query = \"Should I buy a yatch?\" #@param {type:\"string\"}\n", + "format_answer(ask_question(query, model))" + ] + }, + { + "cell_type": "markdown", + "id": "cff9b420-2afe-4e28-b45a-0c88c6ded05e", + "metadata": {}, + "source": [ + "
\n", + "
" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.6" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}