Grit prod #4 (Draft)

Wants to merge 12 commits into base: main.
1 change: 1 addition & 0 deletions .dockerignore
@@ -3,3 +3,4 @@ cookbook
.circleci
.github
tests
.env
11 changes: 9 additions & 2 deletions Dockerfile
@@ -35,7 +35,7 @@ RUN pip install dist/*.whl
# install dependencies as wheels
RUN pip wheel --no-cache-dir --wheel-dir=/wheels/ -r requirements.txt

# install semantic-cache [Experimental] - we need this here and not in requirements.txt because redisvl pins to pydantic 1.0
RUN pip install redisvl==0.0.7 --no-deps

# ensure pyjwt is used, not jwt
@@ -61,13 +61,20 @@ COPY --from=builder /wheels/ /wheels/
# Install the built wheel using pip; again using a wildcard if it's the only file
RUN pip install *.whl /wheels/* --no-index --find-links=/wheels/ && rm -f *.whl && rm -rf /wheels

# copy in our custom requirements
COPY custom/requirements.txt /app/custom/requirements.txt
RUN pip install -r custom/requirements.txt

# Generate prisma client
RUN prisma generate
RUN chmod +x entrypoint.sh

EXPOSE 4000/tcp

# Copy in our config
COPY custom/config.yaml /app/config.yaml

ENTRYPOINT ["litellm"]

# Append "--detailed_debug" to the end of CMD to view detailed debug logs
CMD ["--port", "4000"]
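The image bakes the config in at `/app/config.yaml`, but the `CMD` only sets `--port`, so the runtime still has to pass `--config` (the compose file below does). A quick way to check a locally built image is a stdlib readiness probe; a minimal sketch, assuming `docker run -p 4000:4000 <image> --config /app/config.yaml` and that litellm's unauthenticated `/health/readiness` endpoint is available:

```python
# readiness_probe.py - poll the proxy after `docker run -p 4000:4000 <image>`;
# a sketch, assumes litellm's unauthenticated /health/readiness endpoint.
import time
import urllib.error
import urllib.request

URL = "http://localhost:4000/health/readiness"

for attempt in range(30):
    try:
        with urllib.request.urlopen(URL, timeout=2) as resp:
            print("ready:", resp.status)  # 200 once the proxy is serving
            break
    except (urllib.error.URLError, ConnectionError):
        time.sleep(1)  # proxy still starting; retry
else:
    raise SystemExit("proxy never became ready")
```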
62 changes: 62 additions & 0 deletions custom/config.yaml
@@ -0,0 +1,62 @@
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY
  - model_name: gpt-4
    litellm_params:
      model: gpt-4
      api_key: os.environ/OPENAI_API_KEY
  - model_name: gpt-4-turbo-preview
    litellm_params:
      model: gpt-4-turbo-preview
      api_key: os.environ/OPENAI_API_KEY
  - model_name: gpt-4-0314
    litellm_params:
      model: gpt-4-0314
      api_key: os.environ/OPENAI_API_KEY
  - model_name: gpt-3.5-turbo-0301
    litellm_params:
      model: gpt-3.5-turbo-0301
      api_key: os.environ/OPENAI_API_KEY
  - model_name: gpt-3.5-turbo-16k
    litellm_params:
      model: gpt-3.5-turbo-16k
      api_key: os.environ/OPENAI_API_KEY
  # Anthropic
  - model_name: claude-3-5-sonnet-20240620
    litellm_params:
      model: claude-3-5-sonnet-20240620
      api_key: "os.environ/ANTHROPIC_API_KEY"
  - model_name: claude-3-opus-20240229
    litellm_params:
      model: claude-3-opus-20240229
      api_key: "os.environ/ANTHROPIC_API_KEY"
  - model_name: claude-3-sonnet-20240229
    litellm_params:
      model: claude-3-sonnet-20240229
      api_key: "os.environ/ANTHROPIC_API_KEY"
  - model_name: claude-3-haiku-20240307
    litellm_params:
      model: claude-3-haiku-20240307
      api_key: "os.environ/ANTHROPIC_API_KEY"
  - model_name: claude-2
    litellm_params:
      model: claude-2
      api_key: "os.environ/ANTHROPIC_API_KEY"
  - model_name: claude-2.0
    litellm_params:
      model: claude-2
      api_key: "os.environ/ANTHROPIC_API_KEY"
litellm_settings:
  drop_params: True
  otel: True
  set_verbose: True
  cache: True
  cache_params:
    type: "redis"
    host: os.environ/REDIS_HOST
    port: os.environ/REDIS_PORT
    password: os.environ/REDIS_PASSWORD
  # success_callback: ["traceloop"]
  callbacks: ["otel"]
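None of the `api_key` values above are literal secrets: the `os.environ/NAME` form tells litellm to substitute the named environment variable when it loads the config (the same trick covers the redis cache credentials). For intuition only, the resolution amounts to roughly this sketch, not litellm's actual implementation:

```python
# env_resolve.py - rough sketch of litellm's "os.environ/NAME" indirection;
# illustrative only, not the library's real code path.
import os

PREFIX = "os.environ/"

def resolve(value):
    """Swap an os.environ/NAME reference for the variable's value."""
    if isinstance(value, str) and value.startswith(PREFIX):
        return os.environ[value[len(PREFIX):]]  # KeyError if the var is unset
    return value

os.environ["ANTHROPIC_API_KEY"] = "sk-ant-placeholder"  # stand-in for the real key
print(resolve("os.environ/ANTHROPIC_API_KEY"))  # -> sk-ant-placeholder
print(resolve("claude-2"))                      # plain values pass through
```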
48 changes: 48 additions & 0 deletions custom/requirements.txt
@@ -0,0 +1,48 @@
# TRACING SDK #
traceloop-sdk==0.18.2 # for the (currently commented-out) traceloop success_callback

# LITELLM PROXY DEPENDENCIES #
anyio==4.2.0 # openai + http req.
openai==1.27.0 # openai req.
fastapi==0.111.0 # server dep
backoff==2.2.1 # server dep
pyyaml==6.0.0 # server dep
uvicorn==0.29.0 # server dep
gunicorn==22.0.0 # server dep
boto3==1.34.34 # aws bedrock/sagemaker calls
redis==5.0.0 # caching
numpy==1.24.3 # semantic caching
pandas==2.1.1 # for viewing clickhouse spend analytics
prisma==0.11.0 # for db
mangum==0.17.0 # for aws lambda functions
pynacl==1.5.0 # for encrypting keys
google-cloud-aiplatform==1.47.0 # for vertex ai calls
anthropic[vertex]==0.21.3
google-generativeai==0.5.0 # for vertex ai calls
async_generator==1.10.0 # for async ollama calls
langfuse==2.27.1 # for langfuse self-hosted logging
datadog-api-client==2.23.0 # for datadog logging
prometheus_client==0.20.0 # for /metrics endpoint on proxy
orjson==3.9.15 # fast /embedding responses
apscheduler==3.10.4 # for resetting budget in background
fastapi-sso==0.10.0 # admin UI, SSO
pyjwt[crypto]==2.8.0
python-multipart==0.0.9 # admin UI
Pillow==10.3.0
azure-ai-contentsafety==1.0.0 # for azure content safety
azure-identity==1.15.0 # for azure content safety

### LITELLM PACKAGE DEPENDENCIES
python-dotenv==1.0.0 # for env
tiktoken==0.6.0 # for calculating usage
importlib-metadata==6.8.0 # for random utils
tokenizers==0.14.0 # for calculating usage
click==8.1.7 # for proxy cli
jinja2==3.1.4 # for prompt templates
certifi==2023.7.22 # [TODO] clean up
aiohttp==3.9.0 # for network calls
aioboto3==12.3.0 # for async sagemaker calls
tenacity==8.2.3 # for retrying requests, when litellm.num_retries set
pydantic==2.7.1 # openai req.
####
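With this many exact pins, drift between the file and the image is easy to miss. A small sketch to diff the pins against what is actually installed, assuming simple `pkg==version` lines (the `check_pins.py` name is a placeholder):

```python
# check_pins.py - compare custom/requirements.txt pins against the installed
# environment; a rough sketch that assumes simple "pkg==version" lines.
import re
from importlib.metadata import PackageNotFoundError, version

PIN = re.compile(r"^\s*([A-Za-z0-9_.\[\]-]+)==([\w.]+)")

with open("custom/requirements.txt") as f:
    for line in f:
        m = PIN.match(line)
        if not m:
            continue  # skip comments, blanks, and section headers
        name = m.group(1).split("[")[0]  # drop extras like pyjwt[crypto]
        pinned = m.group(2)
        try:
            installed = version(name)  # name normalization varies by Python version
        except PackageNotFoundError:
            installed = "MISSING"
        flag = "" if installed == pinned else "  <-- mismatch"
        print(f"{name}: pinned {pinned}, installed {installed}{flag}")
```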
29 changes: 9 additions & 20 deletions docker-compose.yml
@@ -5,25 +5,14 @@ services:
       context: .
       args:
         target: runtime
-    image: ghcr.io/berriai/litellm:main-stable
     ports:
-      - "4000:4000" # Map the container port to the host, change the host port if necessary
-    environment:
-      DATABASE_URL: "postgresql://postgres:example@db:5432/postgres"
-      STORE_MODEL_IN_DB: "True" # allows adding models to proxy via UI
+      - "9200:4000" # Map the container port to the host, change the host port if necessary
     env_file:
-      - .env # Load local .env file
-
-  db:
-    image: postgres
-    restart: always
-    environment:
-      POSTGRES_PASSWORD: example
-    healthcheck:
-      test: ["CMD-SHELL", "pg_isready"]
-      interval: 1s
-      timeout: 5s
-      retries: 10
-
-# ...rest of your docker-compose config if any
+      - .env
+    volumes:
+      - ./custom/config.yaml:/app/config.yaml
+      # - ./litellm-config.yaml:/app/config.yaml # Mount the local configuration file
+    # You can change the port or number of workers here, or pass any other supported CLI argument. Make sure the port passed here matches the container port in `ports` above.
+    command:
+      ["--config", "/app/config.yaml", "--port", "4000", "--num_workers", "8"]
+# ...rest of your docker-compose config if any
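Once the stack is up (`docker compose up`), host port 9200 fronts an OpenAI-compatible API for every `model_name` in `custom/config.yaml`. A minimal request sketch, assuming no master key is enforced (if one is, pass it as `api_key`):

```python
# smoke_request.py - one chat completion through the proxy; a sketch that
# assumes the compose stack is running and no master key is configured.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9200",  # host port from docker-compose.yml
    api_key="anything",                # placeholder; use the master key if auth is on
)

resp = client.chat.completions.create(
    model="gpt-3.5-turbo",  # any model_name from custom/config.yaml works
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```

Because `cache: True` routes through redis, repeating the identical request should be served from cache, provided the `REDIS_*` variables are set in `.env`.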
3 changes: 3 additions & 0 deletions litellm/proxy/requirements.txt
@@ -0,0 +1,3 @@
opentelemetry-api
opentelemetry-sdk
opentelemetry-exporter-otlp
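These three packages back the `otel: True` / `callbacks: ["otel"]` settings in `custom/config.yaml`. For reference, the same wiring done by hand with the SDK looks roughly like this sketch (the collector endpoint below is an assumption):

```python
# otel_sketch.py - hand-rolled version of the span export the otel callback
# provides; the collector endpoint below is an assumption.
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317"))
)
trace.set_tracer_provider(provider)

with trace.get_tracer("proxy-smoke").start_as_current_span("demo-span"):
    pass  # spans are batched and flushed to the collector on shutdown
```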