Update deployment spec to use vllm
Scott Davidson committed Oct 26, 2023
1 parent 27ecf84 commit 715d953
Showing 1 changed file with 3 additions and 4 deletions.
7 changes: 3 additions & 4 deletions templates/api/deployment.yml
```diff
@@ -24,9 +24,9 @@ spec:
         containerPort: 80
         volumeMounts:
         - name: data
-          mountPath: /data
+          mountPath: /root/.cache/huggingface
         args:
-        - --model-id
+        - --model
         - {{ .Values.huggingface.model }}
         {{- if .Values.huggingface.secretName }}
         envFrom:
@@ -44,7 +44,6 @@ spec:
             port: 80
           initialDelaySeconds: 15
           periodSeconds: 10
-        # TODO: Make this configurable
         resources:
           limits:
             nvidia.com/gpu: {{ .Values.api.gpus | int }}
@@ -53,7 +52,7 @@ spec:
       - name: data
         # emptyDir:
         hostPath:
-          path: /tmp/tgi/data
+          path: /tmp/llm/data
       # Suggested in text-generation-inference docs
       - name: shm
        emptyDir:
```
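For reference, a minimal sketch of how the patched container spec might render after Helm templating. The model name and GPU count below are illustrative assumptions, not values from this commit; the chart's actual defaults may differ. It shows why the two renamed fields change: vLLM's OpenAI-compatible server takes `--model` (where text-generation-inference used `--model-id`), and vLLM reads downloaded weights from the Hugging Face hub cache under `~/.cache/huggingface`.

```yaml
# Hypothetical rendered fragment of templates/api/deployment.yml, assuming
# .Values.huggingface.model = mistralai/Mistral-7B-v0.1 and
# .Values.api.gpus = 1 (illustrative values only).
containers:
- name: api
  ports:
  - containerPort: 80
  volumeMounts:
  - name: data
    # vLLM looks for cached weights in the HF hub cache, so the data
    # volume is mounted at /root/.cache/huggingface instead of /data.
    mountPath: /root/.cache/huggingface
  args:
  # vLLM flag; text-generation-inference used --model-id for the same value.
  - --model
  - mistralai/Mistral-7B-v0.1
  resources:
    limits:
      nvidia.com/gpu: 1
```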
