Update deploying_with_k8s.rst (vllm-project#10922)
AlexHe99 authored Dec 16, 2024
1 parent 25ebed2 commit da6f409
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/source/serving/deploying_with_k8s.rst
@@ -162,7 +162,7 @@ To test the deployment, run the following ``curl`` command:
curl http://mistral-7b.default.svc.cluster.local/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "facebook/opt-125m",
"model": "mistralai/Mistral-7B-Instruct-v0.3",
"prompt": "San Francisco is a",
"max_tokens": 7,
"temperature": 0
@@ -172,4 +172,4 @@ If the service is correctly deployed, you should receive a response from the vLLM model:

Conclusion
----------
Deploying vLLM with Kubernetes allows for efficient scaling and management of ML models leveraging GPU resources. By following the steps outlined above, you should be able to set up and test a vLLM deployment within your Kubernetes cluster. If you encounter any issues or have suggestions, please feel free to contribute to the documentation.
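
For reference, here is the complete test command as it reads after this change (the closing brace of the JSON payload is truncated in the diff view above). This is a sketch that assumes the ``mistral-7b`` Service in the ``default`` namespace set up earlier in the guide. The fix matters because the OpenAI-compatible server only accepts requests whose "model" field matches the model it is actually serving, so the old "facebook/opt-125m" value would be rejected by a server running Mistral-7B-Instruct-v0.3.

curl http://mistral-7b.default.svc.cluster.local/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "mistralai/Mistral-7B-Instruct-v0.3",
        "prompt": "San Francisco is a",
        "max_tokens": 7,
        "temperature": 0
      }'

If the deployment is healthy, the server returns an OpenAI-style completion object. The exact values will vary; an illustrative response might look roughly like:

{
  "id": "cmpl-...",
  "object": "text_completion",
  "model": "mistralai/Mistral-7B-Instruct-v0.3",
  "choices": [
    {"index": 0, "text": " city in California", "finish_reason": "length"}
  ]
}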
