feat(operator): add Model selector for scale subresource to enable HPA-based scaling #5932

Merged
lc525 merged 3 commits into SeldonIO:v2 from INFRA-1190/hpa-autoscaling-verizon on Sep 24, 2024

@lc525 lc525 commented Sep 23, 2024

  • updates the Model CRD to contain a pod selector in the scale subresource
  • sets the selector to a label server=[inference-server-name] matching no actual pods
  • docs [to be moved to gitbook before merging]
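
For illustration only, here is a minimal Go sketch of how a selector path can be wired into a CRD's scale subresource and populated with a `server=<name>` label. The `+kubebuilder:subresource:scale` marker and its `specpath`, `statuspath` and `selectorpath` arguments are standard kubebuilder syntax. The type names, field names and JSON paths (`.spec.replicas`, `.status.replicas`, `.status.selector`) and the `scaleSelector` helper are assumptions made for this example, not taken from the PR diff.

```go
package main

import "fmt"

// ModelSpec is a simplified, hypothetical stand-in for the real Model spec.
type ModelSpec struct {
	// Server is the name of the inference server hosting the model (assumed field).
	Server string `json:"server,omitempty"`
	// Replicas is the desired replica count, written via the scale subresource.
	Replicas *int32 `json:"replicas,omitempty"`
}

// ModelStatus is a simplified, hypothetical stand-in for the real Model status.
type ModelStatus struct {
	// Replicas is the observed replica count, read via the scale subresource.
	Replicas int32 `json:"replicas,omitempty"`
	// Selector is the serialized label selector exposed to the HPA.
	Selector string `json:"selector,omitempty"`
}

// Model is a hypothetical CRD type. The marker below asks controller-gen to
// emit a scale subresource that includes a selector path; the three JSON
// paths are assumptions for this sketch.
//
// +kubebuilder:subresource:scale:specpath=.spec.replicas,statuspath=.status.replicas,selectorpath=.status.selector
type Model struct {
	Name   string
	Spec   ModelSpec
	Status ModelStatus
}

// scaleSelector builds the label selector string described in the PR:
// server=<inference-server-name>. Per the PR description, no actual pods
// carry this label, so the selector is expected to match nothing.
func scaleSelector(serverName string) string {
	return fmt.Sprintf("server=%s", serverName)
}

func main() {
	// "iris" and "mlserver" are placeholder names.
	m := Model{Name: "iris", Spec: ModelSpec{Server: "mlserver"}}
	m.Status.Selector = scaleSelector(m.Spec.Server)
	fmt.Println(m.Status.Selector)
}
```

In this sketch the controller would refresh `Status.Selector` on each reconcile, so the value reported through the scale subresource follows the Model's current server assignment.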

Which issue(s) this PR fixes:
Fixes #1190 (internal): allow HPA-based Model autoscaling

Special notes for your reviewer:

  • docs added to the PR to show how HPA would work
  • tested in kind (small-scale)

TODO:

  • test availability during HPA scale-up/scale-down via k6 load test (can be done after merging)

@lc525 lc525 requested a review from sakoush as a code owner September 23, 2024 17:41
@lc525 lc525 added the v2 label Sep 23, 2024
@lc525 lc525 marked this pull request as draft September 23, 2024 17:42
@lc525 lc525 changed the title add(operator): Model selector for scale subresource to enable HPA-based scaling feat(operator): add Model selector for scale subresource to enable HPA-based scaling Sep 23, 2024
@lc525 lc525 (Member Author) left five review comments, each reading:

generated

@lc525 lc525 marked this pull request as ready for review September 23, 2024 17:49
@sakoush sakoush left a comment

LGTM

@lc525 lc525 force-pushed the INFRA-1190/hpa-autoscaling-verizon branch from 0de0dc8 to 4ba04e9 Compare September 24, 2024 11:24
describe the behaviour when a Model gets scaled up slightly before its Server
@lc525 lc525 force-pushed the INFRA-1190/hpa-autoscaling-verizon branch from 4ba04e9 to d101859 Compare September 24, 2024 16:08
@lc525 lc525 merged commit 1bd8d0f into SeldonIO:v2 Sep 24, 2024
3 checks passed
@lc525 lc525 deleted the INFRA-1190/hpa-autoscaling-verizon branch September 24, 2024 16:29