Rag Doll is a chat-with-your-documents style Retrieval Augmented Generation (RAG) system. RAG is a specialised use of a Large Language Model (LLM) in which items from a knowledge base are added to the prompt to get better answers.
There are many RAG implementations out there and I don't claim this one to be better than any of the others. Rag Doll does not support multi-modal chat at this time. Maybe later; feel free to submit a pull request. :-)
The implementation is mostly Python, although the heavy lifting is done by pre-trained machine learning models. You'll want to run this on something with a decent GPU, or you will find this all to be very slow. Rag Doll is broken up into several containers, each with a single responsibility (or as close to that as I could get). Containerising makes it easier to upgrade and improve individual components.
The assistant handles queries to the RAG for us. It awaits messages from the user chat queue, queries the knowledge base and builds a prompt for the LLM.
We use OpenAI as the model run-time. OpenAI provides robust capabilities for managing multiple models and handling large model files. It simplifies the integration process by managing registrations and pulling models as needed.
.env | default | description |
---|---|---|
ASSISTANT_PORT | 5001 | The port used by the Assistant for health checks. |
ASSISTANT_LANGUAGES | en, CHANGEME, CHANGEME | A comma-separated list of ISO 639-1 language codes. Be sure to add a section to the system prompt that describes these languages. |
OPENAI_API_KEY | CHANGEME | The API key for authenticating with OpenAI services. |
OPENAI_CHAT_MODEL | gpt-4o | The LLM that is used to handle chat messages. Read more about OpenAI models. |
CHROMADB_DISTANCE_CUTOFF | 1.5 | The maximum vector distance at which a chunk is still included in the prompt as RAG context. Chunks with a higher distance are discarded from the RAG query results. |
For the final configuration, be sure to add one each of system prompt, RAG prompt and RAG-less prompt for every language in ASSISTANT_LANGUAGES. This gives the system a specific set of prompts for each language. All language codes are ISO 639-1 codes.
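The interplay between the distance cutoff and the two prompt flavours can be pictured with a minimal sketch. This is not the assistant's actual code: the `Chunk` type, the `PROMPTS` dictionary and its template text are hypothetical. Only the cutoff rule (chunks with a higher distance are discarded) and the RAG versus RAG-less distinction come from the description above.

```python
from dataclasses import dataclass

DISTANCE_CUTOFF = 1.5  # mirrors the CHROMADB_DISTANCE_CUTOFF default


@dataclass
class Chunk:
    text: str
    distance: float  # vector distance to the query; lower means closer


# Hypothetical per-language templates; the real ones come from configuration.
PROMPTS = {
    "en": {
        "rag": "Context:\n{context}\n\nQuestion: {question}",
        "ragless": "Question: {question}",
    },
}


def build_prompt(question: str, chunks: list[Chunk], language: str = "en") -> str:
    """Drop chunks beyond the cutoff, then pick the RAG or RAG-less template."""
    relevant = [c for c in chunks if c.distance <= DISTANCE_CUTOFF]
    templates = PROMPTS[language]
    if not relevant:
        # Nothing close enough in the knowledge base: ask without context.
        return templates["ragless"].format(question=question)
    # Closest chunks first.
    context = "\n".join(c.text for c in sorted(relevant, key=lambda c: c.distance))
    return templates["rag"].format(context=context, question=question)
```

Raising the cutoff lets more, but less relevant, chunks into the prompt; lowering it makes the assistant fall back to the RAG-less prompt more often.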
Note: The OPENAI_API_KEY does not need to be passed explicitly in assistant.py because the openai library automatically reads it from the environment when openai.OpenAI() is instantiated.
The EPPO librarian is responsible for getting the EPPO Global Database data sheet data into the vector database. It runs at startup, recreating the data set that is to be used for the retrieval part of the system.
The EPPO Global Database is a collection of technical resources that researchers can use in their work. As quoted from their website: EPPO Global Database is maintained by the Secretariat of the European and Mediterranean Plant Protection Organization (EPPO). The aim of the database is to provide all pest-specific information that has been produced or collected by EPPO. The database contents are constantly being updated by the EPPO Secretariat.
.env | default | description |
---|---|---|
CHROMADB_COLLECTION_TEMPLATE | EPPO-datasheets-{} | The template for the names of the ChromaDB collections where each translation of the EPPO datasheets will be stored. This should have one {} placeholder. |
EPPO_COUNTRY_ORGANISM_URL | https://gd.eppo.int/country/{country}/organisms.csv | The URL to the per-country organism list on the EPPO database. Use {country} as placeholder for the country to query for. |
EPPO_DATASHEET_URL | https://gd.eppo.int/taxon/{eppo_code}/datasheet | The URL to the organism datasheet in the EPPO database. Use {eppo_code} as placeholder for the EPPO code. |
EPPO_COUNTRIES | CHANGEME | A comma-separated list of ISO 3166-1 alpha-2 country codes of countries that you are interested in. |
OPENAI_API_KEY | CHANGEME | The API key for authenticating with OpenAI services. |
OPENAI_CHAT_MODEL | gpt-4o | The LLM that is used to handle plain text translation. |
CHUNK_SIZE | 5 | For small data sets, a few sentences will have to do. |
OVERLAP_SIZE | 1 | The EPPO librarian uses rooftiling. This is the overlap. |
PLAIN_TEXT_SYSTEM_PROMPT | CHANGEME | The system prompt for translating the scientific text into plain language. |
PLAIN_TEXT_PROMPT | CHANGEME | The prompt template for translating the scientific text into plain language. Must have a {text} placeholder. |
EPPO_PORT | 5002 | The port used by the EPPO librarian for health checks. |
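To illustrate what CHUNK_SIZE and OVERLAP_SIZE control, here is a minimal sketch of rooftiling, assuming both values are counted in sentences. The function names and the naive sentence splitter are hypothetical and not taken from the EPPO librarian itself.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def rooftile(sentences: list[str], chunk_size: int = 5, overlap: int = 1) -> list[str]:
    """Group sentences into chunks that share `overlap` sentences with the previous chunk."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(sentences), step):
        chunks.append(" ".join(sentences[start:start + chunk_size]))
        if start + chunk_size >= len(sentences):
            break  # this window already reached the last sentence
    return chunks
```

With the defaults, nine sentences become two chunks: sentences 1 to 5 and sentences 5 to 9, so the fifth sentence appears in both. The overlap keeps context that straddles a chunk boundary retrievable from either side.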
EPPO is not completely clear on what license they expect. They do not restrict accessing the datasheets. They do ask for citation, which we provide.
The vector database takes care of embedding and semantic search on the knowledge base library. Rag Doll uses Chroma DB because it is lightweight and easy to interface with.
See also Running Chroma.
.env | default | description |
---|---|---|
CHROMADB_HOST | chromadb | The hostname of the vector database container. |
CHROMADB_PORT | 8000 | The port that the vector database container listens on. |
To communicate between the services we use message queues. This allows us to organise and scale workloads, while keeping each component to a single responsibility.
user message:
field | data type | description |
---|---|---|
id | string | Message identification number as the originating platform knows it. |
timestamp | ISO8601 UTC | Message timestamp as the originating platform knows it. |
platform | enum: SLACK/WHATSAPP/SMS/VOICE | Originating platform. Intended to be able to parse the platform-specific fields. |
from | platform-specific address | Enough information for the originating platform to be able to route a reply to this message to where the user expects it. |
text | UTF-8 string | The text as provided by the user. |
from field (where platform equals SLACK or WHATSAPP):
platform Slack format...
platform WhatsApp format... E.164 numbers
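As an illustration of the user message fields, the sketch below builds one message and serialises it. It assumes the message travels over the queue as JSON, which the description above does not state, and the helper name `make_user_message` is hypothetical. The `from` value here is a plain E.164 number for a WhatsApp sender; the exact platform-specific shape may differ.

```python
import json
from datetime import datetime, timezone
from typing import Optional

PLATFORMS = {"SLACK", "WHATSAPP", "SMS", "VOICE"}


def make_user_message(msg_id: str, platform: str, sender, text: str,
                      timestamp: Optional[datetime] = None) -> str:
    """Serialise a user message with the fields from the table above."""
    if platform not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform}")
    ts = timestamp or datetime.now(timezone.utc)
    message = {
        "id": msg_id,
        # ISO 8601 timestamp, normalised to UTC.
        "timestamp": ts.astimezone(timezone.utc).isoformat(),
        "platform": platform,
        "from": sender,
        "text": text,
    }
    # Keep non-ASCII user text readable instead of escaping it.
    return json.dumps(message, ensure_ascii=False)
```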
.env | default | description |
---|---|---|
RABBITMQ_USER | rabbit | The user name for RabbitMQ. |
RABBITMQ_PASS | CHANGEME | The default password for accessing queues. Use a generated string. |
RABBITMQ_QUEUE_USER_CHATS | user_chats | The queue for chat messages that the user typed. |
RABBITMQ_QUEUE_USER_CHAT_REPLIES | user_chat_replies | The queue for chat messages that the assistant got from the LLM. |
RABBITMQ_EXCHANGE_USER_CHATS | user_chats_exchange | The topic exchange that routes messages to queues. |
RABBITMQ_HOST | rabbitmq | The host that RabbitMQ runs on. |
RABBITMQ_PORT | 5672 | The AMQP port of RabbitMQ. |
RABBITMQ_MANAGEMENT_PORT | 15672 | The HTTP port for the management web UI of RabbitMQ. |
The backend of this project is built using FastAPI, a modern and high-performance web framework for building APIs with Python 3.12.3. The backend communicates with a PostgreSQL database to manage and store application data. The PostgreSQL database is initialized with predefined scripts located in the ./postgres/docker-entrypoint-initdb.d directory, ensuring that the database schema and initial data are set up automatically. Additionally, a PgAdmin4 service is provided to offer a user-friendly interface for managing the PostgreSQL database. PgAdmin4 is configured to run on port 5050 and can be accessed using the default credentials specified in the environment variables.
.env | default | description |
---|---|---|
BACKEND_PORT | 5000 | The external port used by the Backend. |
JWT_SECRET | CHANGEME | JWT-based auth secret key, used in the process of signing a token. |
WEBDOMAIN | "http://localhost" | The base URL of the web application. |
MAGIC_LINK_CHAT_TEMPLATE | CHANGEME | A template for the magic link message, e.g. "You can login into Agriconnect by clicking this link: {magic_link}" |
GOOGLE_APPLICATION_CREDENTIALS_PATH | CHANGEME | Path to the service account JSON key file location used for authentication and accessing Google Cloud services (Development only). |
GOOGLE_APPLICATION_CREDENTIALS | CHANGEME | JSON key file name used for authentication and accessing Google Cloud services. |
BUCKET_NAME | CHANGEME | Bucket name for a storage object (offered by Google Cloud). |
TESTING | None | An environment variable used for testing purposes when running backend tests. This variable is automatically set to 1 by conftest to mock or skip certain steps related to third-party services. Please note that TESTING should not be included in the Docker Compose environment. |
INITIAL_CHAT_TEMPLATE | CHANGEME | A template for the initial chat message, e.g. "Hi {farmer_name}, I'm {officer_name} the extension officer. Welcome to Agriconnect, send us a message here to start chatting." The template should contain {farmer_name} and {officer_name}. |
LAST_MESSAGES_LIMIT | 10 | The maximum number of last messages to resend to a user in a chat session. |
ASSISTANT_LAST_MESSAGES_LIMIT | 10 | The maximum number of previous chat messages to retrieve and feed into the assistant for generating suggestions. |
NEXT_PUBLIC_VAPID_PUBLIC_KEY | CHANGEME | The public key for web push notification generated by web-push. |
NEXT_PUBLIC_VAPID_PRIVATE_KEY | CHANGEME | The private key for web push notification generated by web-push. |
CHROMADB_HOST | chromadb | The hostname of the vector database container, for health checks. |
CHROMADB_PORT | 8000 | The port that the vector database container listens on, for health checks. |
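Because MAGIC_LINK_CHAT_TEMPLATE and INITIAL_CHAT_TEMPLATE must contain specific placeholders, it can help to check them before use. The sketch below is one way to do that with the standard library; the helper names `placeholders` and `render` are hypothetical and not part of the backend.

```python
from string import Formatter


def placeholders(template: str) -> set[str]:
    """Return the named {placeholders} that appear in a format template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def render(template: str, required: set[str], **values: str) -> str:
    """Fill a template after checking that it has the required placeholders."""
    missing = required - placeholders(template)
    if missing:
        raise ValueError(f"template lacks placeholders: {sorted(missing)}")
    return template.format(**values)
```

For example, `render(template, {"farmer_name", "officer_name"}, farmer_name="Asha", officer_name="Juma")` fails early with a clear error if someone edits the template and drops a placeholder.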
Before using the application, you can seed the database with user, client, and chat session data using the chat_session seeder. Follow the instructions below to set up and run the seeder.
Prepare a Google Sheet with the following columns (or you can use this template):
- client_phone_number: Phone number of the client (including the + sign).
- client_name: Name of the client (can be empty).
- linked_to_user_phone_number: Phone number of the user linked to the client (including the + sign).
- user_name: Name of the user (can be empty).

Ensure that the Google Sheet is publicly accessible.
- Save your data in the prepared Google Sheet template.
- From the backend directory, run the following command:
  python -m seeder.chat_session
- The script will prompt you for the Google Sheet ID, which can be found in the URL of the Google Sheet. Enter the ID and press Enter.
- The seeder will process the data and populate your database with the user, client, and chat session information.
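Before running the seeder, you may want to sanity-check the sheet contents. The sketch below assumes you have the sheet available as CSV text; it is not the seeder's own code, and `read_rows` is a hypothetical helper. It only checks the rules listed above: all four columns exist and both phone number columns start with a + sign.

```python
import csv
import io

COLUMNS = ["client_phone_number", "client_name",
           "linked_to_user_phone_number", "user_name"]


def read_rows(csv_text: str) -> list[dict]:
    """Parse sheet rows and check the columns and phone number format."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = []
    for line_no, row in enumerate(reader, start=2):  # row 1 is the header
        for key in ("client_phone_number", "linked_to_user_phone_number"):
            if not (row[key] or "").startswith("+"):
                raise ValueError(f"row {line_no}: {key} must start with '+'")
        rows.append(row)
    return rows
```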
In the backend, we send and receive Twilio messages through a service called TwilioClient. Currently, we only support WhatsApp text messages.
When started, TwilioClient listens for incoming messages from Twilio using a webhook. TwilioClient will use the frontend port proxy to point to the Twilio callback URL. In Twilio, configure the sandbox webhook URL to be the external URL for your TwilioClient routes.
The TwilioClient connects to the message queue to interact with the rest of the system, notably the assistant. Incoming messages are forwarded to the RABBITMQ_QUEUE_USER_CHATS queue and replies coming from the RABBITMQ_QUEUE_USER_CHAT_REPLIES queue are posted back to the user via Twilio.
.env | default | description |
---|---|---|
TWILIO_ACCOUNT_SID | CHANGEME | The Account SID for your Twilio account. |
TWILIO_AUTH_TOKEN | CHANGEME | Your Twilio authorization token. |
TWILIO_WHATSAPP_NUMBER | CHANGEME | The Twilio WhatsApp number from your Twilio account in international format. |
VERIFICATION_TEMPLATE_ID_en | NULL | The Twilio message template ID for the verification message in English. This template should contain two content variables: {"1": extension_officer_name, "2": verification_link}. Leave blank for local development. |
VERIFICATION_TEMPLATE_ID_sw | NULL | The Twilio message template ID for the verification message in Swahili. This template should contain two content variables: {"1": extension_officer_name, "2": verification_link}. Leave blank for local development. |
VERIFICATION_TEMPLATE_ID_fr | NULL | The Twilio message template ID for the verification message in French. This template should contain two content variables: {"1": extension_officer_name, "2": verification_link}. Leave blank for local development. |
BROADCAST_TEMPLATE_ID_en | NULL | The Twilio message template ID for the broadcast message in English. This template should contain two content variables: {"1": farmer_name, "2": broadcast_message_without_new_line}. Leave blank for local development. |
BROADCAST_TEMPLATE_ID_sw | NULL | The Twilio message template ID for the broadcast message in Swahili. This template should contain two content variables: {"1": farmer_name, "2": broadcast_message_without_new_line}. Leave blank for local development. |
BROADCAST_TEMPLATE_ID_fr | NULL | The Twilio message template ID for the broadcast message in French. This template should contain two content variables: {"1": farmer_name, "2": broadcast_message_without_new_line}. Leave blank for local development. |
INTRO_TEMPLATE_ID_en | NULL | The Twilio message template ID for the introduction message in English. This template should contain two content variables: {"1": farmer_name, "2": extension_officer_name}. Leave blank for local development. |
INTRO_TEMPLATE_ID_sw | NULL | The Twilio message template ID for the introduction message in Swahili. This template should contain two content variables: {"1": farmer_name, "2": extension_officer_name}. Leave blank for local development. |
INTRO_TEMPLATE_ID_fr | NULL | The Twilio message template ID for the introduction message in French. This template should contain two content variables: {"1": farmer_name, "2": extension_officer_name}. Leave blank for local development. |
CONVERSATION_RECONNECT_TEMPLATE_en | NULL | The Twilio message template ID for the conversation reconnect message in English. This is used when an officer sends a message to a farmer beyond the 24-hour window. The template should contain one content variable: {"1": farmer_name}. Leave blank for local development. |
CONVERSATION_RECONNECT_TEMPLATE_sw | NULL | The Twilio message template ID for the conversation reconnect message in Swahili. This is used when an officer sends a message to a farmer beyond the 24-hour window. The template should contain one content variable: {"1": farmer_name}. Leave blank for local development. |
CONVERSATION_RECONNECT_TEMPLATE_fr | NULL | The Twilio message template ID for the conversation reconnect message in French. This is used when an officer sends a message to a farmer beyond the 24-hour window. The template should contain one content variable: {"1": farmer_name}. Leave blank for local development. |
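The content variables listed in the table are keyed by position. As a minimal sketch, assuming the variables are handed to Twilio as a JSON string, the broadcast case could be prepared like this. The function name `broadcast_variables` is hypothetical; the newline stripping reflects the broadcast_message_without_new_line requirement above.

```python
import json


def broadcast_variables(farmer_name: str, message: str) -> str:
    """Build the content variables JSON for a broadcast template."""
    # Collapse newlines and repeated whitespace into single spaces,
    # because the second variable must not contain new lines.
    one_line = " ".join(message.split())
    return json.dumps({"1": farmer_name, "2": one_line}, ensure_ascii=False)
```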
By default, when the app starts, a command is executed to fetch the Twilio message templates. This command generates a JSON file located in the ./sources folder. The purpose of this file is to minimize Twilio API calls when saving message templates into our database as part of the chat history.
One important consideration is that if you update the message template in the Twilio Console, you must also update the Message Template ID environment variable to match the new template ID in the Twilio Console > Content Template Builder. After updating the environment variable, run the following command inside the backend container to refresh the JSON file:
python -m command.get_twilio_message_template
Slack is one of the messaging platforms that can be used to chat with Rag Doll. Most of the Slack interface code was taken from Getting started with Bolt for Python.
The Slack client in the backend listens for incoming messages using a webhook, which is handled nicely by the Bolt framework.
.env | default | description |
---|---|---|
SLACK_BOT_TOKEN | CHANGEME | The token for your Slack bot. |
SLACK_SIGNING_SECRET | CHANGEME | The signing secret for your Slack bot. |
When installing the backend as a Slack App, you can use backend/slackbot-app-manifest.yml as a template. Before using it, change the following values:
backend/slackbot-app-manifest.yml | default | description |
---|---|---|
description | CHANGEME | A brief description of the purpose of the bot. |
background_color | CHANGEME | The 6-digit hex colour code for the Slack bot background. |
display_name | CHANGEME | The display name of the Slack bot. This is what people in your workspace will see. |
request_url | http://CHANGEME/slack/events | The external URL that Slack's servers will use to call the Slack bot component. Replace CHANGEME with the external IP address you reserved for your Google Cloud VM running the components. |
You will also want to upload a nice avatar image to go with your bot.
The frontend of this project is developed using React with Next.js. In the development environment, the frontend and backend services are configured to facilitate efficient and streamlined development. The frontend, built with React and Next.js, communicates with the backend API using a proxy setup defined in the next.config.js file. This configuration rewrites requests matching the pattern /api/:path* to be forwarded to the backend service at http://backend:5000/api/:path*. This proxy setup simplifies the API call structure during development, allowing developers to interact with the backend as if it were part of the same application.
.env | default | description |
---|---|---|
FRONTEND_PORT | 3001 | The external port used by the Frontend. |
NEXT_PUBLIC_VAPID_PUBLIC_KEY | CHANGEME | The public key for web push notification generated by web-push. |
NEXT_PUBLIC_VAPID_PRIVATE_KEY | CHANGEME | The private key for web push notification generated by web-push. |

Frontend: http://localhost:${FRONTEND_PORT}
API Docs: http://localhost:${FRONTEND_PORT}/api/docs#/
/** @type {import('next').NextConfig} */
const nextConfig = {
  async rewrites() {
    return [
      {
        source: "/api/:path*",
        destination: "http://backend:5000/api/:path*", // Proxy to Backend
      },
    ];
  },
};

export default nextConfig;
In the production environment, the interaction between the frontend and backend is handled differently to optimize performance and security. Instead of using the proxy setup defined in the development configuration, the frontend and backend services communicate through an Nginx server. The Nginx configuration, located in the frontend folder, acts as a reverse proxy, efficiently routing requests from the frontend to the backend.
This project uses PostgreSQL as the backend database.
.env | default | description |
---|---|---|
POSTGRES_PORT | 5432 | The external port used by the database. |
POSTGRES_PASS | CHANGEME | The default password for accessing the database. |
PGADMIN_PORT | 5050 | The external port used by the PgAdmin page. |
This chapter gives a list of items that you should consider as you deploy the code from this repository. The description assumes you will be deploying to Google Cloud, so if you deploy on a different cloud provider you may see things that are different.
Cached Docker files and images consume a lot of disk space. The stock 10GB disks won't be large enough for Rag Doll, so you probably want to allocate 100GB instead. Depending on how you like to organise disks you can get extra attached storage or just start with larger root disks.
Reserve a static IP address for the webhook calls from Twilio and Slack.
Getting Rag Doll running is a two-step process: first set up your .env file, then start the components with Docker Compose.
The repository contains a template that you can use. It has reasonably sane defaults for most variables. All that you have to do is add keys and passwords and you should be good to go.
Copy env.template to .env and edit that file with your favourite editor. In the template, search for CHANGEME and replace that placeholder with your own key or generated password. Please do not reuse passwords from other places, but use a password generator. You won't have to type them, so making them strong is just as much and as little work as making them weak.
$ cp env.template .env
$ vi .env
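If you do not have a password generator at hand, Python's standard library can serve as one, and the same script can tell you which placeholders are still left. This is a small convenience sketch, not part of the repository; it assumes .env consists of plain NAME=value lines, and both function names are hypothetical.

```python
import secrets


def unresolved_placeholders(env_text: str) -> list[str]:
    """Return the names of variables whose value still contains CHANGEME."""
    names = []
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not NAME=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if "CHANGEME" in value:
            names.append(name.strip())
    return names


def generate_password(nbytes: int = 32) -> str:
    """Generate a random URL-safe password from `nbytes` bytes of entropy."""
    return secrets.token_urlsafe(nbytes)


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        for name in unresolved_placeholders(handle.read()):
            print(f"{name} still needs a value, for example: {generate_password()}")
```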
All variables are documented in the component documentation sections above. With .env set up, all but one component of Rag Doll can be started with the following command:
$ docker compose up
It is well known that Docker eats disk space relentlessly. One particular problem is that the default logging driver for Docker, json-file, does not rotate logs unless you configure it to. Instead, switch Docker over to the local logging driver, which rotates logs by default. See Configure logging drivers for instructions.
$ docker info --format '{{.LoggingDriver}}'
json-file
$ sudo vi /etc/docker/daemon.json
{
"log-driver": "local"
}
$ sudo systemctl restart docker
$ docker info --format '{{.LoggingDriver}}'
local