chatbot-rag-app: updates ENV, docker setup and README #364

Merged · 5 commits · Jan 7, 2025
4 changes: 0 additions & 4 deletions example-apps/chatbot-rag-app/.flaskenv

This file was deleted.

7 changes: 2 additions & 5 deletions example-apps/chatbot-rag-app/Dockerfile
@@ -1,4 +1,4 @@
-FROM node:20-alpine AS build-step
+FROM node:22-alpine AS build-step
WORKDIR /app
ENV PATH=/node_modules/.bin:$PATH
COPY frontend ./frontend
@@ -28,7 +28,4 @@ COPY api ./api
COPY data ./data

EXPOSE 4000
-# The only thing different from running local is that in docker we need to
-# listen on all IPs, not just localhost.
-ENV FLASK_RUN_HOST=0.0.0.0
-CMD [ "flask", "run"]
+CMD [ "python", "api/app.py"]
72 changes: 54 additions & 18 deletions example-apps/chatbot-rag-app/README.md
@@ -26,11 +26,27 @@ use-cases. Visit the [Install Elasticsearch](https://www.elastic.co/search-labs/

Once you decided your approach, edit your `.env` file accordingly.

-### Elasticsearch index and chat_history index
+### Running your own Elastic Stack with Docker

-By default, the app will use the `workplace-app-docs` index and the chat
-history index will be `workplace-app-docs-chat-history`. If you want to change
-these, edit `ES_INDEX` and `ES_INDEX_CHAT_HISTORY` entries in your `.env` file.
If you'd like to start Elastic locally, you can use the provided
[docker-compose-elastic.yml](docker-compose-elastic.yml) file. This starts
Elasticsearch, Kibana, and APM Server and only requires Docker installed.

Use docker compose to run the Elastic Stack in the background:

```bash
docker compose -f docker-compose-elastic.yml up --force-recreate -d
```

Then, you can view Kibana at http://localhost:5601/app/home#/

If asked for a username and password, use username: elastic and password: elastic.

Clean up when finished, like this:

```bash
docker compose -f docker-compose-elastic.yml down
```
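
A quick sanity check that the stack started above is reachable (not part of the PR; the credentials and the Kibana status string come from the compose file added in this change):

```bash
# Elasticsearch: should return cluster info as JSON
curl -s -u elastic:elastic http://localhost:9200

# Kibana: the same check the compose healthcheck performs
curl -s http://localhost:5601/api/status | grep -q 'All services are available' && echo "kibana ready"
```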

## Connecting to LLM

@@ -67,6 +83,12 @@ docker compose up --build --force-recreate
*Note*: First time creating the index can fail on timeout. Wait a few minutes
and retry.

Clean up when finished, like this:

```bash
docker compose down
```

### Run locally

If you want to run this example with Python and Node.js, you need to do a few
@@ -95,23 +117,16 @@ correct packages installed:
```bash
python3 -m venv .venv
source .venv/bin/activate
-# install dev requirements for pip-compile and dotenv
-pip install pip-tools "python-dotenv[cli]"
-pip-compile
+# Install dotenv which is a portable way to load environment variables.
+pip install "python-dotenv[cli]"
pip install -r requirements.txt
```

#### Run the ingest command

First, ingest the data into elasticsearch:
```bash
-$ dotenv run -- flask create-index
-".elser_model_2" model not available, downloading it now
-Model downloaded, starting deployment
-Loading data from ./data/data.json
-Loaded 15 documents
-Split 15 documents into 26 chunks
-Creating Elasticsearch sparse vector store in http://localhost:9200
+FLASK_APP=api/app.py dotenv run -- flask create-index
```

*Note*: First time creating the index can fail on timeout. Wait a few minutes
@@ -121,12 +136,33 @@ and retry.

Now, run the app, which listens on http://localhost:4000
```bash
-$ dotenv run -- flask run
- * Serving Flask app 'api/app.py'
- * Debug mode: off
+dotenv run -- python api/app.py
```

-## Customizing the app
+## Advanced

### Updating package versions

To update package versions, recreate [requirements.txt](requirements.txt) and
reinstall like this. Once checked in, any commands above will use updates.

```bash
rm -rf .venv
python3 -m venv .venv
source .venv/bin/activate
# Install dev requirements for pip-compile
pip install pip-tools
# Recreate requirements.txt
pip-compile
# Install main dependencies
pip install -r requirements.txt
```

### Elasticsearch index and chat_history index

By default, the app will use the `workplace-app-docs` index and the chat
history index will be `workplace-app-docs-chat-history`. If you want to change
these, edit `ES_INDEX` and `ES_INDEX_CHAT_HISTORY` entries in your `.env` file.

### Indexing your own data

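To make the index override described above concrete, a hedged sketch of the relevant `.env` entries (the index names are illustrative):

```bash
# .env — override the default index names used by the app
ES_INDEX=my-docs
ES_INDEX_CHAT_HISTORY=my-docs-chat-history
```
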
3 changes: 1 addition & 2 deletions example-apps/chatbot-rag-app/api/app.py
@@ -37,6 +37,5 @@ def create_index():
index_data.main()


-# Unless we run through flask, we can miss critical settings or telemetry signals.
if __name__ == "__main__":
-    raise RuntimeError("Run via the parent directory: 'flask run'")
+    app.run(host="0.0.0.0", port=4000, debug=False)
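
With this change, docker and local runs start the server the same way. As a hedged check (the exact response depends on the built frontend), confirm the app is listening on port 4000:

```bash
# Prints the HTTP status code returned by the app on http://localhost:4000
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:4000
```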
49 changes: 18 additions & 31 deletions example-apps/chatbot-rag-app/api/llm_integrations.py
Expand Up @@ -11,60 +11,47 @@


def init_openai_chat(temperature):
+    # Include streaming usage as this allows recording of LLM metrics
    return ChatOpenAI(
-        model=os.getenv("CHAT_MODEL"), streaming=True, temperature=temperature
+        model=os.getenv("CHAT_MODEL"),
+        streaming=True,
+        temperature=temperature,
+        model_kwargs={"stream_options": {"include_usage": True}},
)


def init_vertex_chat(temperature):
-    VERTEX_PROJECT_ID = os.getenv("VERTEX_PROJECT_ID")
-    VERTEX_REGION = os.getenv("VERTEX_REGION", "us-central1")
-    vertexai.init(project=VERTEX_PROJECT_ID, location=VERTEX_REGION)
-    return ChatVertexAI(streaming=True, temperature=temperature)
+    return ChatVertexAI(
+        model_name=os.getenv("CHAT_MODEL"), streaming=True, temperature=temperature
+    )


def init_azure_chat(temperature):
+    # Include streaming usage as this allows recording of LLM metrics
    return AzureChatOpenAI(
-        model=os.getenv("CHAT_DEPLOYMENT"), streaming=True, temperature=temperature
+        model=os.getenv("CHAT_DEPLOYMENT"),
+        streaming=True,
+        temperature=temperature,
+        model_kwargs={"stream_options": {"include_usage": True}},
)


def init_bedrock(temperature):
-    AWS_ACCESS_KEY = os.getenv("AWS_ACCESS_KEY")
-    AWS_SECRET_KEY = os.getenv("AWS_SECRET_KEY")
-    AWS_REGION = os.getenv("AWS_REGION")
-    AWS_MODEL_ID = os.getenv("AWS_MODEL_ID", "anthropic.claude-v2")
    return ChatBedrock(
-        region_name=AWS_REGION,
-        aws_access_key_id=AWS_ACCESS_KEY,
-        aws_secret_access_key=AWS_SECRET_KEY,
-        model_id=AWS_MODEL_ID,
+        model_id=os.getenv("CHAT_MODEL"),
        streaming=True,
        model_kwargs={"temperature": temperature},
)


def init_mistral_chat(temperature):
-    MISTRAL_API_ENDPOINT = os.getenv("MISTRAL_API_ENDPOINT")
-    MISTRAL_API_KEY = os.getenv("MISTRAL_API_KEY")
-    MISTRAL_MODEL = os.getenv("MISTRAL_MODEL", "Mistral-large")
-    kwargs = {
-        "mistral_api_key": MISTRAL_API_KEY,
-        "temperature": temperature,
-    }
-    if MISTRAL_API_ENDPOINT:
-        kwargs["endpoint"] = MISTRAL_API_ENDPOINT
-    if MISTRAL_MODEL:
-        kwargs["model"] = MISTRAL_MODEL
-    return ChatMistralAI(**kwargs)
+    return ChatMistralAI(
+        model=os.getenv("CHAT_MODEL"), streaming=True, temperature=temperature
+    )


def init_cohere_chat(temperature):
-    COHERE_API_KEY = os.getenv("COHERE_API_KEY")
-    COHERE_MODEL = os.getenv("COHERE_MODEL")
-    return ChatCohere(
-        cohere_api_key=COHERE_API_KEY, model=COHERE_MODEL, temperature=temperature
-    )
+    return ChatCohere(model=os.getenv("CHAT_MODEL"), temperature=temperature)


MAP_LLM_TYPE_TO_CHAT_MODEL = {
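The `stream_options={"include_usage": True}` addition is what lets token usage be recorded while still streaming. A minimal sketch of what that enables, assuming `langchain-openai` is installed and `OPENAI_API_KEY` is set (the fallback model name below is only an example):

```python
import os

from langchain_openai import ChatOpenAI

# Mirrors init_openai_chat above: stream tokens, and ask OpenAI to append usage
llm = ChatOpenAI(
    model=os.getenv("CHAT_MODEL", "gpt-4o-mini"),
    streaming=True,
    temperature=0,
    model_kwargs={"stream_options": {"include_usage": True}},
)

full = None
for chunk in llm.stream("Say hello in one word"):
    print(chunk.content, end="", flush=True)
    full = chunk if full is None else full + chunk

# With include_usage, the aggregated message carries input/output token counts
print()
print(full.usage_metadata)
```
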
91 changes: 91 additions & 0 deletions example-apps/chatbot-rag-app/docker-compose-elastic.yml
@@ -0,0 +1,91 @@
name: elastic-stack

services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:8.17.0
container_name: elasticsearch
ports:
- 9200:9200
environment:
- node.name=elasticsearch
- cluster.name=docker-cluster
- discovery.type=single-node
- ELASTIC_PASSWORD=elastic
- bootstrap.memory_lock=true
- xpack.security.enabled=true
- xpack.security.http.ssl.enabled=false
- xpack.security.transport.ssl.enabled=false
- xpack.license.self_generated.type=trial
- ES_JAVA_OPTS=-Xmx8g
ulimits:
memlock:
soft: -1
hard: -1
healthcheck:
test: ["CMD-SHELL", "curl -s http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=500ms"]
retries: 300
interval: 1s

elasticsearch_settings:
depends_on:
elasticsearch:
condition: service_healthy
image: docker.elastic.co/elasticsearch/elasticsearch:8.17.0
container_name: elasticsearch_settings
restart: 'no'
command: >
bash -c '
# gen-ai assistants in kibana save state in a way that requires security to be enabled, so we need to create
# a kibana system user before starting it.
echo "Setup the kibana_system password";
until curl -s -u "elastic:elastic" -X POST http://elasticsearch:9200/_security/user/kibana_system/_password -d "{\"password\":\"elastic\"}" -H "Content-Type: application/json" | grep -q "^{}"; do sleep 5; done;
'

kibana:
image: docker.elastic.co/kibana/kibana:8.17.0
container_name: kibana
depends_on:
elasticsearch_settings:
condition: service_completed_successfully
ports:
- 5601:5601
environment:
- SERVERNAME=kibana
- ELASTICSEARCH_HOSTS=http://elasticsearch:9200
- ELASTICSEARCH_USERNAME=kibana_system
- ELASTICSEARCH_PASSWORD=elastic
# Non-default settings from here:
# https://github.com/elastic/apm-server/blob/main/testing/docker/kibana/kibana.yml
- MONITORING_UI_CONTAINER_ELASTICSEARCH_ENABLED=true
- XPACK_SECURITY_ENCRYPTIONKEY=fhjskloppd678ehkdfdlliverpoolfcr
- XPACK_ENCRYPTEDSAVEDOBJECTS_ENCRYPTIONKEY=fhjskloppd678ehkdfdlliverpoolfcr
- SERVER_PUBLICBASEURL=http://127.0.0.1:5601
healthcheck:
test: ["CMD-SHELL", "curl -s http://localhost:5601/api/status | grep -q 'All services are available'"]
retries: 300
interval: 1s

apm-server:
image: docker.elastic.co/apm/apm-server:8.17.0
container_name: apm-server
depends_on:
elasticsearch:
condition: service_healthy
command: >
apm-server
-E apm-server.kibana.enabled=true
-E apm-server.kibana.host=http://kibana:5601
-E apm-server.kibana.username=elastic
-E apm-server.kibana.password=elastic
-E output.elasticsearch.hosts=["http://elasticsearch:9200"]
-E output.elasticsearch.username=elastic
-E output.elasticsearch.password=elastic
cap_add: ["CHOWN", "DAC_OVERRIDE", "SETGID", "SETUID"]
cap_drop: ["ALL"]
ports:
- 8200:8200
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo -n > /dev/tcp/127.0.0.1/8200'"]
retries: 300
interval: 1s
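
Note the Elasticsearch service above asks for an 8 GB heap (`ES_JAVA_OPTS=-Xmx8g`). If your machine cannot spare that, a hedged sketch of a compose override (the file name is illustrative, and a smaller heap may not fit the ELSER model comfortably):

```yaml
# docker-compose-elastic.override.yml (hypothetical) — shrink the ES heap
services:
  elasticsearch:
    environment:
      - ES_JAVA_OPTS=-Xmx2g
```

Apply it with `docker compose -f docker-compose-elastic.yml -f docker-compose-elastic.override.yml up -d`; compose merges `environment` entries per variable, so only the heap setting changes.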

15 changes: 13 additions & 2 deletions example-apps/chatbot-rag-app/docker-compose.yml
@@ -1,13 +1,20 @@
name: chatbot-rag-app

services:
ingest-data:
build:
context: .
container_name: ingest-data
restart: 'no'
environment:
# host.docker.internal means connect to the host machine, e.g. your laptop
ELASTICSEARCH_URL: "http://host.docker.internal:9200"
FLASK_APP: api/app.py
env_file:
- .env
-      - .flaskenv
command: flask create-index
extra_hosts:
- "host.docker.internal:host-gateway"

api-frontend:
depends_on:
@@ -16,8 +23,12 @@ services:
container_name: api-frontend
build:
context: .
environment:
# host.docker.internal means connect to the host machine, e.g. your laptop
ELASTICSEARCH_URL: "http://host.docker.internal:9200"
env_file:
- .env
-      - .flaskenv
ports:
- "4000:4000"
extra_hosts:
- "host.docker.internal:host-gateway"
25 changes: 16 additions & 9 deletions example-apps/chatbot-rag-app/env.example
@@ -28,24 +28,31 @@ ES_INDEX_CHAT_HISTORY=workplace-app-docs-chat-history

# Uncomment and complete if you want to use Bedrock LLM
# LLM_TYPE=bedrock
-# AWS_ACCESS_KEY=
-# AWS_SECRET_KEY=
-# AWS_REGION=
-# AWS_MODEL_ID=
+# AWS_ACCESS_KEY_ID=
+# AWS_SECRET_ACCESS_KEY=
+# AWS_DEFAULT_REGION=
+# CHAT_MODEL=amazon.titan-text-lite-v1

# Uncomment and complete if you want to use Vertex AI
# LLM_TYPE=vertex
-# VERTEX_PROJECT_ID=
-# VERTEX_REGION=
## Project that has the service "aiplatform.googleapis.com" enabled
# GOOGLE_CLOUD_PROJECT=
# GOOGLE_CLOUD_REGION=
# CHAT_MODEL=gemini-1.5-flash-002
## Needed if you haven't run `gcloud auth application-default login`
# GOOGLE_APPLICATION_CREDENTIALS=

# Uncomment and complete if you want to use Mistral AI
# LLM_TYPE=mistral
## Key in https://console.mistral.ai/api-keys/
# MISTRAL_API_KEY=
-# MISTRAL_API_ENDPOINT=
-# MISTRAL_MODEL=
## 'API Endpoints' from https://docs.mistral.ai/getting-started/models/models_overview/
# CHAT_MODEL=open-mistral-nemo
## Only set this if not using the default Mistral base URL
# MISTRAL_BASE_URL=

# Uncomment and complete if you want to use Cohere
# LLM_TYPE=cohere
## Key in https://dashboard.cohere.com/api-keys
# COHERE_API_KEY=
-# COHERE_MODEL=
# CHAT_MODEL=command-r7b-12-2024
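
The Bedrock variables were renamed to the standard AWS SDK names, so the boto3 default credential chain picks them up without the explicit wiring that `init_bedrock` used to do. As a hedged illustration (placeholder values), they can also be exported in the shell instead of `.env`:

```bash
# Standard AWS SDK environment variables (placeholders, not real credentials)
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_DEFAULT_REGION=us-east-1
export CHAT_MODEL=amazon.titan-text-lite-v1
```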