The Reactome ChatBot is an interactive tool that provides information about biological entities and processes using Advanced RAG techniques. It leverages the Reactome database to retrieve relevant information based on user queries.
- Minimum requirements:
  - Python 3.12
  - Poetry 1.8.*
- Requirements for running the complete application:
Follow these steps to run the barebones Chainlit application.
- Clone the repository:
git clone https://github.com/reactome/reactome_chatbot.git
- Navigate to the project directory:
cd reactome_chatbot
- Install dependencies using Poetry:
poetry install
- Verify that your PYTHONPATH environment variable includes ./src:
echo $PYTHONPATH # ./src
- List embeddings available for download:
./bin/embeddings_manager ls-remote
- Install your chosen embeddings:
./bin/embeddings_manager install openai/text-embedding-3-large/reactome/ReleaseXX
- Run the Chainlit application:
chainlit run bin/chat-chainlit.py -w
- Access the app at http://localhost:8000 🎉
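If the PYTHONPATH check in the steps above prints nothing, the variable is probably unset in your shell. A minimal sketch for adding ./src for the current session, assuming a POSIX-compatible shell and that you are in the project root (the project may set this up in another way, so treat this as one option rather than the documented setup):

```shell
# Prepend ./src to PYTHONPATH unless it is already listed.
case ":${PYTHONPATH:-}:" in
  *":./src:"*) ;;  # already present, nothing to do
  *) export PYTHONPATH="./src${PYTHONPATH:+:$PYTHONPATH}" ;;
esac
echo "$PYTHONPATH"
```

The export only lasts for the current shell session; rerun it in each new terminal before starting Chainlit.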
The project uses Docker Compose to manage the PostgreSQL database. The database configuration is stored in the docker-compose.yml file, and the environment variables are stored in the .env file.
Follow these steps to run the complete application in Docker.
- Create a copy of the env_template file and name it .env:
cp env_template .env
- Configure the application by editing the environment variables in .env:
  - OPENAI_API_KEY: add your OpenAI key.
  - CLOUDFLARE_SECRET_KEY: keep blank to disable captcha.
  - CHAINLIT_IMAGE=reactome-chatbot: set this to use your local Docker build.
  - Use the following variables to configure Auth0. This will enable Chainlit user login and chat history.
    OAUTH_AUTH0_CLIENT_ID
    OAUTH_AUTH0_CLIENT_SECRET
    OAUTH_AUTH0_DOMAIN
- List embeddings available for download:
docker compose run --rm chainlit /bin/bash -c "./bin/embeddings_manager ls-remote"
- Install your chosen embeddings:
docker compose run --rm chainlit /bin/bash -c "./bin/embeddings_manager install openai/text-embedding-3-large/reactome/ReleaseXX"
- Build the Docker image (do this every time you make local changes):
docker build -t reactome-chatbot .
- Start the Chainlit application and PostgreSQL database in Docker containers:
docker-compose up
# To run it in the background, use:
# docker-compose up -d
- Access the app at http://localhost:8000 🎉
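Before starting the containers, it can help to confirm that the required key from the configuration step is actually filled in. The following is a minimal sketch, not part of the project: `env_var_is_set` is a hypothetical helper, and it assumes .env contains plain `NAME=value` lines with no quoting or `export` prefixes.

```shell
# Hypothetical helper: succeeds if VAR has a non-empty value in FILE.
# Assumes plain NAME=value lines (an assumption about the .env layout).
env_var_is_set() {
  file=$1
  var=$2
  grep -Eq "^${var}=.+" "$file"
}

# Warn if the OpenAI key is missing before running docker-compose up.
if [ -f .env ] && env_var_is_set .env OPENAI_API_KEY; then
  echo "OPENAI_API_KEY is set"
else
  echo "OPENAI_API_KEY is missing or empty in .env" >&2
fi
```

The same helper can be pointed at the Auth0 variables if you want login enabled. It only checks that a value is present, not that the value is valid.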
The ChatBot's knowledge of a given data source is generated using the latest data release, resulting in a bundle of embedded information and/or text documents. For simplicity, we refer to these bundles as Embeddings throughout this document.
In the case of Reactome, embeddings bundles are generated once per release from reactome/graphdb releases from DockerHub and uploaded to AWS S3 for easy retrieval.
All aspects of generating, managing, uploading, and retrieving embeddings bundles are handled by the ./bin/embeddings_manager script.
- Basic usage is covered in the Quick Start guide above.
- See the Embeddings Manager documentation for more information.
To run the main consistency checks:
poetry run ruff check .
To make the code style consistent:
poetry run black .
To make sure imports are organized:
poetry run isort .
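The three commands above can also be chained so that they only report problems without rewriting files, which is convenient before opening a pull request. This is a sketch under assumptions rather than a script shipped with the project: `run_checks` and the `RUNNER` override are hypothetical names, and the check-only flags (`black --check`, `isort --check-only`) are the standard options of those tools.

```shell
# Hypothetical helper: run the three tools in check-only mode and
# stop at the first failure. RUNNER is an illustrative override
# (for example RUNNER=echo for a dry run); it defaults to "poetry run".
run_checks() {
  runner=${RUNNER:-"poetry run"}
  $runner ruff check . &&
  $runner black --check . &&
  $runner isort --check-only .
}
```

Call `run_checks` from the project root. A non-zero exit status means at least one tool found something to fix, in which case the original commands above apply the changes.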
Contributions to the Reactome ChatBot project are welcome! If you encounter any issues or have suggestions for improvements, feel free to open an issue or submit a pull request.
Please make sure to follow our contributing guidelines and code of conduct.
This project is licensed under the MIT License.