Add example of helm chart for vllm deployment on k8s #9199
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
Thanks for the PR. Do you have any thoughts on how / where this could be tested to ensure it remains functional?
I propose the following tests to be launched in a GitHub workflow on every pull request:
That sounds good to me. Is that something you'd be willing to work on? Since the PR content is pretty standalone, it shouldn't be at much risk of going into conflict in the meantime.
Sure, I can work on implementing these tests.
@russellb I just added some GitHub workflows to implement functional tests on the Helm chart:
All these tests have been implemented in a GitHub workflow. @simon-mo @khluu how can I get the rights to add these tests to the vLLM GitHub workflows? Would you prefer that I migrate them to Buildkite?
Signed-off-by: Maxime Fournioux <[email protected]>
Thanks for addressing all of my feedback! I've pinged some maintainers to take a look.
Many thanks!
Thanks for spending the time and effort on setting this up!
You are welcome!
This PR adds an example Helm chart for deploying vLLM on a Kubernetes cluster. The goal is to provide a reference deployment with a well-suited configuration for k8s.
This example implements a self-contained vLLM deployment with k8s probes (startup, readiness, liveness) that wait for the model to be fully loaded; the pod is marked as running once the health check endpoint returns 200.
As shown in the figure in the README, the deployment follows two steps:
Step 1: Load the model from S3 to a volume. An init container is launched and waits for a job to copy the model from S3 to the volume.
Step 2: Launch the vLLM engine. Once the model is available on the volume, vLLM is launched.
This deployment will launch two containers:
The init container, which is marked as completed once the download job is done.
A container hosting the vLLM engine, which stays in pending status while the init container is running and is marked as running once the init container has completed.
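To illustrate the probe setup described above, here is a minimal sketch of what the three probes might look like on the vLLM container. It is not copied from the chart in this PR: the timing values are illustrative assumptions, and the `/health` path and port 8000 assume the default vLLM OpenAI-compatible server settings.

```yaml
# Fragment of a container spec (sketch only; values are assumptions,
# not the ones used by the chart in this PR).
startupProbe:
  httpGet:
    path: /health        # assumed: vLLM OpenAI-compatible server health endpoint
    port: 8000           # assumed: default server port
  periodSeconds: 10
  failureThreshold: 60   # tolerates up to ~10 minutes of model loading
readinessProbe:
  httpGet:
    path: /health
    port: 8000
  periodSeconds: 5
  failureThreshold: 3    # pod is removed from Service endpoints while failing
livenessProbe:
  httpGet:
    path: /health
    port: 8000
  periodSeconds: 10
  failureThreshold: 3    # container is restarted after repeated failures
```

Kubernetes holds back the readiness and liveness probes until the startup probe has succeeded, so a slow model load does not trigger a restart as long as it finishes within `periodSeconds * failureThreshold`.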
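The two-container layout described above could be sketched as the following Pod manifest. This is an assumption-laden illustration rather than the chart's actual template: the resource names, the PVC, the `busybox` image, and the marker file `/data/.download-complete` are all hypothetical, and the chart in this PR may wait for the download job through a different mechanism.

```yaml
# Sketch only: every name and path below is hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: vllm-example
spec:
  volumes:
    - name: model-storage
      persistentVolumeClaim:
        claimName: vllm-model-pvc        # hypothetical PVC filled by the download job
  initContainers:
    - name: wait-for-model
      image: busybox:1.36
      # Assumed convention: the download job writes a marker file when it is done.
      command: ["sh", "-c", "until [ -f /data/.download-complete ]; do sleep 5; done"]
      volumeMounts:
        - name: model-storage
          mountPath: /data
  containers:
    - name: vllm
      image: vllm/vllm-openai:latest
      args: ["--model", "/data"]         # serve the model copied onto the volume
      ports:
        - containerPort: 8000
      volumeMounts:
        - name: model-storage
          mountPath: /data
```

Kubernetes does not start the `vllm` container until the init container has exited successfully, which produces the pending-then-running sequence described above.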
FIX #6073