Add example of helm chart for vllm deployment on k8s #9199

mfournioux · 2024-10-09T16:20:34Z

This PR adds an example Helm chart for deploying vLLM on a Kubernetes cluster. The goal is to provide a reference deployment with a recommended configuration for k8s.

This example implements an autonomous vLLM deployment with k8s probes (startup, readiness, liveness) that wait for the model to be fully loaded. The pod is marked as running once the health check endpoint returns 200.
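To illustrate what the probes do, here is a minimal Python sketch of the same polling logic. It is not code from the chart: the `/health` path on port 8000 mentioned in the usage note is an assumption about how the server is exposed, and `http_status` and `wait_until_healthy` are hypothetical helper names whose parameters only mirror the Kubernetes probe fields `failureThreshold` and `periodSeconds`.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for a GET on url, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but not with a 2xx code
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, timeout, DNS failure, ...


def wait_until_healthy(
    probe: Callable[[], int],
    failure_threshold: int = 30,
    period_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() until it returns 200 or the attempts are used up."""
    for attempt in range(failure_threshold):
        if probe() == 200:
            return True
        # Wait between attempts, but not after the last one.
        if attempt < failure_threshold - 1:
            sleep(period_seconds)
    return False
```

A call such as `wait_until_healthy(lambda: http_status("http://localhost:8000/health"))` would block until the server reports healthy or the attempts run out. Passing the probe and the sleep function as arguments keeps the loop testable without a running server.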

As shown in the figure in the README file, the deployment follows two steps:

Step 1: Load the model from an S3 bucket to a volume. An init container is launched and waits for a job to load the model from S3 to the volume.
Step 2: Launch the vLLM engine. Once the model is loaded on the volume, vLLM is launched.

This deployment launches two containers:

An init container, which is marked as completed once the download job is done.
A container hosting the vLLM engine, which stays in pending status while the init container is running and is marked as running once the init container has completed.
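The two-container layout described above can be sketched as a pod spec built in Python. This is an illustration under assumptions, not the chart's actual template: the init container image and command, the `model-volume` name, the `emptyDir` volume, the `/health` path, and the probe timings are all hypothetical placeholders. Only the Kubernetes field names (`initContainers`, `startupProbe`, `readinessProbe`, `livenessProbe`, `volumeMounts`) are standard.

```python
def build_pod_spec(image: str, model_dir: str = "/data", port: int = 8000) -> dict:
    """Build a pod spec with one init container and one vLLM container."""
    http_get = {"path": "/health", "port": port}
    volume_mount = {"name": "model-volume", "mountPath": model_dir}
    return {
        "initContainers": [
            {
                # Hypothetical image and command: a stand-in for whatever
                # blocks until the model download job has finished.
                "name": "wait-for-model",
                "image": "example.invalid/model-waiter:latest",
                "command": ["sh", "-c", "echo waiting for model download"],
                "volumeMounts": [volume_mount],
            }
        ],
        "containers": [
            {
                "name": "vllm",
                "image": image,
                "ports": [{"containerPort": port}],
                "volumeMounts": [volume_mount],
                # The startup probe gives the model time to load before the
                # readiness and liveness probes take over.
                "startupProbe": {"httpGet": http_get, "periodSeconds": 10, "failureThreshold": 60},
                "readinessProbe": {"httpGet": http_get, "periodSeconds": 5},
                "livenessProbe": {"httpGet": http_get, "periodSeconds": 10},
            }
        ],
        # Placeholder volume type; a real deployment would more likely use
        # a persistent volume claim so the model survives pod restarts.
        "volumes": [{"name": "model-volume", "emptyDir": {}}],
    }
```

Because init containers must finish before the main containers start, the vLLM container cannot begin serving until the model is on the shared volume. The resulting dictionary can be serialized with `json.dumps` for inspection.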

FIX #6073

github-actions · 2024-10-09T16:20:45Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add ready label to the PR
Enable auto-merge.

🚀

russellb · 2024-10-09T18:42:08Z

Thanks for the PR. Do you have any thoughts on how / where this could be tested to ensure it remains functional?

mfournioux · 2024-10-10T12:33:10Z

I propose running the following tests in a GitHub workflow on every pull request:

Lint the Helm chart
Validate the Kubernetes manifests with Kubeconform
Create a kind cluster to get a local Kubernetes cluster running in Docker as the test environment
Test helm install
Test the service with a POST request and ensure we get a 200 response.
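As a rough sketch of how the first four steps could be scripted, the Python below assembles the commands and runs them in order. The chart path, release name, cluster name, and the 10-minute timeout are hypothetical, and the exact flags are one plausible choice rather than what the PR's workflow uses. The final HTTP check is left out here.

```python
import shlex
import subprocess
from typing import List, Tuple

Step = Tuple[str, str]  # (step name, shell command)


def build_steps(chart_dir: str, release: str = "test-vllm", cluster: str = "vllm-ci") -> List[Step]:
    """Return the shell commands for lint, validation, cluster setup, install."""
    chart = shlex.quote(chart_dir)
    return [
        ("lint", f"helm lint {chart}"),
        # Render the chart and pipe the manifests into kubeconform via stdin.
        ("validate", f"helm template {chart} | kubeconform -strict -summary"),
        ("cluster", f"kind create cluster --name {shlex.quote(cluster)}"),
        # --wait blocks until the release's resources report ready.
        ("install", f"helm install {shlex.quote(release)} {chart} --wait --timeout 10m"),
    ]


def run_steps(steps: List[Step]) -> None:
    """Run each command in order and stop at the first failure."""
    for name, command in steps:
        print(f"[{name}] {command}")
        subprocess.run(command, shell=True, check=True)
```

Calling `run_steps(build_steps("path/to/chart"))` requires `helm`, `kubeconform`, and `kind` on the PATH. Keeping command construction separate from execution lets the command list be checked without any of those tools installed.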

russellb · 2024-10-10T13:19:28Z

That sounds good to me. Is that something you'd be willing to work on? Since the PR content is pretty standalone, it shouldn't be at much risk of going into conflict in the meantime.

mfournioux · 2024-10-10T14:24:30Z

Sure, I can work on implementing these tests.

mfournioux · 2024-11-20T09:25:23Z

@russellb I just added some GitHub workflows to implement functional tests on the Helm chart:

lint test of the chart
input validation with values.schema.json
set up a local MinIO to mock S3 and download the opt-125m model for the test
set up a local k8s cluster with kind
deploy the Helm chart on the kind cluster (CPU only; I used the CPU vLLM Docker image)
curl test to check that the service responds correctly

All these tests have been implemented in a GitHub workflow.
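For readers who want the equivalent of the curl check in Python, here is a minimal sketch. It assumes the service exposes vLLM's OpenAI-compatible `/v1/completions` endpoint. The base URL and the served model name depend on how the chart is configured, so they are passed in as arguments, and both helper names are hypothetical.

```python
import json
import urllib.request


def build_completion_request(
    base_url: str, model: str, prompt: str, max_tokens: int = 8
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible completions endpoint."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def service_responds(request: urllib.request.Request, timeout: float = 30.0) -> bool:
    """Send the request and report whether the service answered with 200."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, timeouts, and non-2xx HTTP responses.
        return False
```

With the service port-forwarded locally, `service_responds(build_completion_request("http://localhost:8000", "opt-125m", "Hello"))` would return `True` on a 200 response. The address and model name in that call are examples only.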

@simon-mo @khluu how can I get the rights to add the tests I have implemented to the vLLM GitHub workflows? Would you prefer that I migrate them to Buildkite?

mergify · 2024-11-20T11:05:04Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @mfournioux.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

russellb · 2024-12-04T20:07:31Z

Thanks for addressing all of my feedback! I've pinged some maintainers to take a look.

mfournioux · 2024-12-05T09:29:04Z

Many thanks!

DarkLight1337

Thanks for spending the time and effort on setting this up!

mfournioux · 2024-12-10T14:13:19Z

You are welcome!

mfournioux mentioned this pull request Oct 9, 2024

[Frontend] Add readiness and liveness endpoints to OpenAI API server #7078

Closed

mfournioux changed the title ~~Add example of chart helm for vllm deployment on k8s~~ Add example of helm chart for vllm deployment on k8s Oct 10, 2024

mfournioux force-pushed the add_chart_helm_example branch from 09860ab to 8d421da Compare November 20, 2024 11:04

mfournioux requested review from mgoin, youkaichao, alexm-neuralmagic, comaniac, simon-mo, robertgshaw2-neuralmagic, tlrmchlsmth, WoosukKwon, njhill, LiuXiaoxuanPKU, KuntaiDu, DarkLight1337, ywang96 and zhuohan123 as code owners November 20, 2024 11:04

mergify bot added the documentation, ci/build, and frontend labels Nov 20, 2024

mergify bot added the needs-rebase label Nov 20, 2024

mfournioux closed this Nov 20, 2024

mfournioux force-pushed the add_chart_helm_example branch from 8d421da to 63f1fde Compare November 20, 2024 11:17

mfournioux added 8 commits December 4, 2024 16:29

update documentation

499cc39

Signed-off-by: Maxime Fournioux <[email protected]>

update documentation

b422f09

Signed-off-by: Maxime Fournioux <[email protected]>

update documentation

610e1d8

Signed-off-by: Maxime Fournioux <[email protected]>

update documentation

002ed4d

Signed-off-by: Maxime Fournioux <[email protected]>

update documentation

0002a31

Signed-off-by: Maxime Fournioux <[email protected]>

update documentation

d98e8c4

Signed-off-by: Maxime Fournioux <[email protected]>

update documentation

93562cf

Signed-off-by: Maxime Fournioux <[email protected]>

correct malformed table error on rst file

8479f9f

Signed-off-by: Maxime Fournioux <[email protected]>

mfournioux requested a review from russellb December 4, 2024 19:01

rename image file used in rst file

3914276

Signed-off-by: Maxime Fournioux <[email protected]>

mfournioux added 4 commits December 5, 2024 08:01

Merge branch 'vllm-project:main' into add_chart_helm_example

9cc1fc2

correct malformed table error on rst file

58c7f96

Signed-off-by: Maxime Fournioux <[email protected]>

correct malformed table error on rst file

435d4dd

Signed-off-by: Maxime Fournioux <[email protected]>

correct malformed table error on rst file

40e481a

Signed-off-by: Maxime Fournioux <[email protected]>

mfournioux added 6 commits December 5, 2024 14:32

Merge branch 'vllm-project:main' into add_chart_helm_example

9fc0593

Merge branch 'vllm-project:main' into add_chart_helm_example

b270d3c

Merge branch 'vllm-project:main' into add_chart_helm_example

a65aa6b

Merge branch 'vllm-project:main' into add_chart_helm_example

1634fec

Merge branch 'vllm-project:main' into add_chart_helm_example

0fe290d

Merge branch 'vllm-project:main' into add_chart_helm_example

77f6675

DarkLight1337 approved these changes Dec 10, 2024

DarkLight1337 enabled auto-merge (squash) December 10, 2024 04:18

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Dec 10, 2024

Fix doc format

911f78e

DarkLight1337 merged commit fe2e10c into vllm-project:main Dec 10, 2024
33 of 34 checks passed

sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024

Add example of helm chart for vllm deployment on k8s (vllm-project#9199)

ba4d49c

Signed-off-by: Maxime Fournioux <[email protected]>

BKitor pushed a commit to BKitor/vllm that referenced this pull request Dec 30, 2024

Add example of helm chart for vllm deployment on k8s (vllm-project#9199)

426afb2

Signed-off-by: Maxime Fournioux <[email protected]>
