CassioML/langchain-flare-pdf-qa-demo

PDF FLARE demo with LangChain and Cassandra as Vector Store

What

Ingest PDF files from their URL into an Astra DB vector store and run FLARE Question-Answering on them.

Features:

  • Python API (CassIO, LangChain, FastAPI) + React client (TypeScript)
  • Per-user store of ingested documents
  • Comparison with other Q&A methods
  • One-click start on Gitpod

For some architectural/flow diagrams, check out this dir.

Prerequisites

You need:

  • an Astra Vector Database (free tier is fine!). You'll be asked to supply a Database Administrator token, the string starting with AstraCS:...;
  • likewise, have your Database ID ready: you will be prompted for it;
  • an OpenAI API Key. (Note that out of the box this demo supports only OpenAI, unless you tinker with the code.)
Note: if you have switched Astra to the new Vector Developer Experience UI, follow the instructions below to get the DB credentials.

Go to your database dashboard and click on the "Connection Details" button on the right. A dialog will open with instructions for connecting. You'll do two things:

  • click "Generate Token" and copy the AstraCS:... string in its entirety once that appears on the dialog;
  • locate the api_endpoint=... line in the Python code example. The database ID is the sequence after https:// and before the dash + region name (e.g. -us-east1) in the definition of the endpoint. It looks like 01234567-89ab-cdef-0123-456789abcdef (and has always this length).
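If you prefer to extract the ID programmatically, here is a small sketch (the endpoint below uses the placeholder ID from above, not a real one):

```python
from urllib.parse import urlparse

# Placeholder endpoint: substitute your own api_endpoint value.
api_endpoint = "https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com"

host = urlparse(api_endpoint).hostname
# The database ID is a UUID: always the first 36 characters of the hostname.
database_id = host[:36]
print(database_id)  # 01234567-89ab-cdef-0123-456789abcdef
```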

DB credentials in the Vector Developer Experience

How-to (Gitpod)

Click the Gitpod button, confirm opening of the workspace (you may need to log in to Gitpod in the process) and wait 1-2 minutes: instructions will appear in the console below, where you'll be prompted for the connection details and the OpenAI key.

In the meantime, the app will open in the top panel.

How-to (local run)

API

Create a Python 3.8+ virtual environment and install the dependencies in requirements.txt.

Copy the environment template (cp .env.template .env) and fill in the secrets for your DB and OpenAI.
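The two setup steps above can be sketched as follows (a POSIX shell is assumed):

```shell
# Create and activate a virtual environment (Python 3.8+), then install deps.
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Copy the environment template, then edit .env with your DB and OpenAI secrets.
cp .env.template .env
```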

Finally enter the subdirectory and launch the API:

cd api
uvicorn api:app

Use a Cassandra cluster

To use a Cassandra cluster instead of Astra DB, check the .env.template file: uncomment the USE_CASSANDRA_CLUSTER environment variable in your .env and provide the necessary connection parameters (keyspace name, plus contact points and/or authentication if required).
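For illustration, a sketch of the relevant .env lines; only USE_CASSANDRA_CLUSTER is named by this README, so the other variable names below are hypothetical (copy the actual names from .env.template):

```shell
# Sketch only: every name except USE_CASSANDRA_CLUSTER is hypothetical;
# check .env.template for the real variable names.
USE_CASSANDRA_CLUSTER=1
CASSANDRA_KEYSPACE=flare_pdf_qa
CASSANDRA_CONTACT_POINTS=192.168.1.10,192.168.1.11
CASSANDRA_USERNAME=cassandra
CASSANDRA_PASSWORD=cassandra
```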

The next time you start the API, it will attempt to connect to Cassandra.

Client

You need a modern Node.js. Enter the subdirectory and install the dependencies:

cd app
npm install

If the API is running you can launch the client:

npm start

and point your browser to local port 3000.

(Note: if the API runs elsewhere, you can launch REACT_APP_API_BASE_URL="http://something..." npm start.)

User journey

First, "log in" (mocked) with a made-up username.

Then you reach the main view. Go to the "Docs" panel, where you can load PDF files by entering their URL (click the "i" icon for example URLs to paste).

You can then "Ask questions", comparing different methods (FLARE / RAG / plain LLM) and their answers.
