feat(operator): add Model selector for scale subresource to enable HPA-based scaling #5932

Merged

lc525 merged 3 commits into SeldonIO:v2 from lc525:INFRA-1190/hpa-autoscaling-verizon

Sep 24, 2024

Member

lc525 commented Sep 23, 2024 (edited)

- Updates the Model CRD to include a pod selector in the scale subresource.
- Sets the selector to the label `server=[inference-server-name]`, which matches no actual pods.
- Adds docs (to be moved to GitBook before merging).
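
As a rough illustration of the first point: a CRD exposes a pod selector to the HPA through the `labelSelectorPath` field of its scale subresource. The fragment below is a sketch, not the generated CRD from this PR. The three JSON paths are assumptions: `.status.selector` is suggested by the follow-up fix #5985, and the replica paths are the conventional ones.

```yaml
# Sketch of the scale subresource inside one CustomResourceDefinition version entry.
# All paths are assumed for illustration; check the generated Model CRD for the real values.
subresources:
  status: {}
  scale:
    specReplicasPath: .spec.replicas      # field the HPA updates with the desired replica count
    statusReplicasPath: .status.replicas  # field the HPA reads for the current replica count
    labelSelectorPath: .status.selector   # string field holding a serialized label selector,
                                          # e.g. "server=<inference-server-name>"
```

The HPA generally expects its scale target to report a selector, which is presumably why one is set here even though it matches no pods.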

Which issue(s) this PR fixes:
Fixes #1190 (internal): allow HPA-based Model autoscaling

Special notes for your reviewer:

- Docs added to the PR to show how HPA would work
- Tested in kind (small-scale)
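
For orientation, an HPA targeting a Model through its scale subresource might look like the sketch below. This is not taken from the PR's docs: the resource names, the namespace, the metric name `infer_rps`, and the target value are all hypothetical, and the `mlops.seldon.io/v1alpha1` API version is assumed. It also presumes a metrics adapter that serves the named metric for Model objects. Because the selector matches no actual pods, the sketch uses an `Object` metric rather than a pod-based one.

```yaml
# Hypothetical HPA scaling a Seldon Model; every name and value is a placeholder.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-model-hpa        # hypothetical
  namespace: seldon-mesh         # hypothetical
spec:
  scaleTargetRef:
    apiVersion: mlops.seldon.io/v1alpha1  # assumed API version of the Model CRD
    kind: Model
    name: example-model          # hypothetical
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Object
      object:
        metric:
          name: infer_rps        # hypothetical custom metric exposed by a metrics adapter
        describedObject:
          apiVersion: mlops.seldon.io/v1alpha1
          kind: Model
          name: example-model
        target:
          type: AverageValue
          averageValue: "3"      # placeholder threshold
```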

TODO:

- Test availability during HPA scale-up/scale-down via a k6 load test (can be done after merging)


add(operator): Model selector for scale subresource to enable HPA-based scaling

c930b00

- updates the Model CRD to contain a pod selector in the scale subresource
- sets the selector to a label `server=[inference-server-name]` matching no actual pods
- docs [to be moved to gitbook before merging]

lc525 requested a review from sakoush as a code owner

September 23, 2024 17:41

lc525 added the v2 label

lc525 marked this pull request as draft

September 23, 2024 17:42

lc525 changed the title ~~add(operator): Model selector for scale subresource to enable HPA-based scaling~~ feat(operator): add Model selector for scale subresource to enable HPA-based scaling

lc525 (Member, Author) commented on Sep 23, 2024, marking each of the following files as generated:

- k8s/helm-charts/seldon-core-v2-crds/templates/seldon-v2-crds.yaml
- k8s/yaml/crds.yaml
- k8s/yaml/runtime.yaml
- k8s/yaml/servers.yaml
- operator/config/crd/bases/mlops.seldon.io_models.yaml

lc525 marked this pull request as ready for review

September 23, 2024 17:49


improve docs

a2b6087

sakoush approved these changes

Member

sakoush left a comment

LGTM

lc525 force-pushed the INFRA-1190/hpa-autoscaling-verizon branch from 0de0dc8 to 4ba04e9 Compare

September 24, 2024 11:24


improve docs

d101859

describe the behaviour when a Model gets scaled up slightly before its Server

lc525 force-pushed the INFRA-1190/hpa-autoscaling-verizon branch from 4ba04e9 to d101859 Compare

September 24, 2024 16:08

lc525 merged commit 1bd8d0f into SeldonIO:v2

3 checks passed

lc525 deleted the INFRA-1190/hpa-autoscaling-verizon branch

September 24, 2024 16:29

lc525 mentioned this pull request

feat(docs): add documentation for HPA-based autoscaling #5935

Merged

sakoush mentioned this pull request

fix: Make status.selector field optional #5985

Merged

1 task

Labels

v2