
Proposal: First class support for HPA #784

Closed

LucasRoesler opened this issue Apr 9, 2021 · 1 comment

Comments

@LucasRoesler
Member

A frequent question in Slack is about alternative ways to control scaling and/or instance placement in a cluster. We already have a tutorial about how to replace the default request-rate auto-scaling with metrics-based scaling via HPA in Kubernetes. See openfaas/docs#164, https://docs.openfaas.com/tutorials/kubernetes-hpa/, and https://docs.openfaas.com/tutorials/kubernetes-hpa-custom-metrics/

Overview

It seems that we could provide a cleaner workflow for this in faas-netes by using annotations. I think an ideal workflow would look something like this:

  1. add a new flag/env variable "hpa-scaling-enabled" to the faas-netes server
  2. add a configuration to the Helm chart to enable this
    • we could even add a pre-install hook to check for the metrics-server to ensure the cluster is ready
  3. During a function deployment, we check the function annotations and create a corresponding HPA object targeting the function. This means we create three k8s objects: a Service, a Deployment, and a HorizontalPodAutoscaler (see the Go sketch after this list).
    • during deletes, we of course delete all three objects
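As a minimal sketch of step 3, assuming a hypothetical deployHPA helper in the faas-netes handlers (the names and wiring here are illustrative, not existing code), creating the HPA with client-go's autoscaling/v2beta2 client could look something like this:

package handlers // hypothetical placement within faas-netes

import (
	"context"

	autoscalingv2beta2 "k8s.io/api/autoscaling/v2beta2"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// deployHPA creates a HorizontalPodAutoscaler targeting the function's
// Deployment, alongside the Service and Deployment we already create.
func deployHPA(ctx context.Context, clientset kubernetes.Interface, namespace, functionName string, minReplicas, maxReplicas int32) error {
	hpa := &autoscalingv2beta2.HorizontalPodAutoscaler{
		ObjectMeta: metav1.ObjectMeta{
			Name:      functionName,
			Namespace: namespace,
		},
		Spec: autoscalingv2beta2.HorizontalPodAutoscalerSpec{
			ScaleTargetRef: autoscalingv2beta2.CrossVersionObjectReference{
				APIVersion: "apps/v1",
				Kind:       "Deployment",
				Name:       functionName,
			},
			MinReplicas: &minReplicas,
			MaxReplicas: maxReplicas,
			// Metrics would be built from the function's annotations,
			// see the annotation mapping discussed below.
		},
	}

	_, err := clientset.AutoscalingV2beta2().
		HorizontalPodAutoscalers(namespace).
		Create(ctx, hpa, metav1.CreateOptions{})
	return err
}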

Doing this should significantly simplify these more advanced use-cases. This could be especially useful for async workflows that should be scaled on metrics like queue depth or cpu/memory usage instead of requests per second.

Details

The HPA walkthrough in the k8s docs gives an advanced example that looks like this:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1beta1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 10k

The simplest implementation would support a single metric for scaling and map 1:1 to the fields, for example

- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 50

would become these annotations

com.openfaas.scale.min: 1
com.openfaas.scale.max: 10
com.openfaas.scale.type: resource
com.openfaas.scale.metric: cpu
com.openfaas.scale.targetType: utilization
com.openfaas.scale.targetValue: 50
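As a hedged sketch of that 1:1 mapping (the helper name metricFromAnnotations is hypothetical, and only the Resource/utilization case from the example is handled), translating those flat annotations back into an autoscaling/v2beta2 MetricSpec could look like:

import (
	"strconv"

	autoscalingv2beta2 "k8s.io/api/autoscaling/v2beta2"
	corev1 "k8s.io/api/core/v1"
)

// metricFromAnnotations maps the flat com.openfaas.scale.* annotations
// above to a single HPA metric. Error handling is omitted for brevity.
func metricFromAnnotations(annotations map[string]string) autoscalingv2beta2.MetricSpec {
	target, _ := strconv.ParseInt(annotations["com.openfaas.scale.targetValue"], 10, 32)
	utilization := int32(target)

	return autoscalingv2beta2.MetricSpec{
		Type: autoscalingv2beta2.ResourceMetricSourceType,
		Resource: &autoscalingv2beta2.ResourceMetricSource{
			Name: corev1.ResourceName(annotations["com.openfaas.scale.metric"]),
			Target: autoscalingv2beta2.MetricTarget{
				Type:               autoscalingv2beta2.UtilizationMetricType,
				AverageUtilization: &utilization,
			},
		},
	}
}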

Alternatively, to support multiple metrics we could use a key=value format

prometheus.io.scrape: true
prometheus.io.port: 8081
com.openfaas.scale.min: 1
com.openfaas.scale.max: 10
com.openfaas.scale.metric.0: type=Resource,metric=cpu,targetType=utilization,targetValue=50
com.openfaas.scale.metric.1: type=Pods,metric=packets-per-second,targetType=AverageValue,targetValue=1k

A single rule could look like:

com.openfaas.scale.min: 1
com.openfaas.scale.max: 10
com.openfaas.scale.metric: type=Resource,metric=cpu,targetType=utilization,targetValue=50

Supporting one or multiple metrics is relatively easy by sorting and filtering the annotations on the string com.openfaas.scale.metric, as in the sketch below.
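A rough sketch of that sort-and-filter step (the helper name parseScalingMetrics is hypothetical, and a real implementation would validate the parsed fields):

import (
	"sort"
	"strings"
)

// parseScalingMetrics collects every annotation key that starts with
// "com.openfaas.scale.metric" in a stable order, then splits each value
// on the proposed key=value,key=value format.
func parseScalingMetrics(annotations map[string]string) []map[string]string {
	var keys []string
	for key := range annotations {
		// Matches both "com.openfaas.scale.metric" (single rule) and
		// "com.openfaas.scale.metric.0", ".1", ... (multiple rules).
		if strings.HasPrefix(key, "com.openfaas.scale.metric") {
			keys = append(keys, key)
		}
	}
	sort.Strings(keys)

	var rules []map[string]string
	for _, key := range keys {
		rule := map[string]string{}
		for _, pair := range strings.Split(annotations[key], ",") {
			if parts := strings.SplitN(pair, "=", 2); len(parts) == 2 {
				rule[parts[0]] = parts[1]
			}
		}
		rules = append(rules, rule)
	}
	return rules
}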

The serialization and key names are of course up for debate.

A note about validation: I think we can roll this out iteratively, starting with built-in metrics and then expanding to custom metrics. When we add support for custom metrics, we should add validation for, or automatically inject, the Prometheus scrape annotations that are required to ensure the metrics exist.

A note about sidecars: In k8s 1.20+ we can also create container resource metrics using ContainerResource, which lets us target just the function container. This is useful in clusters that have sidecar injection, for example clusters using a service mesh or some of the tracing deployments. We should sniff the cluster version and prefer this object over the more generic Resource.
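For illustration, a ContainerResource variant of the CPU metric might look like this in Go (the container name "function" is an assumption made for the sake of the example):

import (
	autoscalingv2beta2 "k8s.io/api/autoscaling/v2beta2"
	corev1 "k8s.io/api/core/v1"
)

// containerResourceCPUMetric scopes the CPU metric to the function's own
// container, ignoring injected sidecars (requires k8s 1.20+).
func containerResourceCPUMetric(averageUtilization int32) autoscalingv2beta2.MetricSpec {
	return autoscalingv2beta2.MetricSpec{
		Type: autoscalingv2beta2.ContainerResourceMetricSourceType,
		ContainerResource: &autoscalingv2beta2.ContainerResourceMetricSource{
			Name:      corev1.ResourceCPU,
			Container: "function", // assumed container name
			Target: autoscalingv2beta2.MetricTarget{
				Type:               autoscalingv2beta2.UtilizationMetricType,
				AverageUtilization: &averageUtilization,
			},
		},
	}
}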

@alexellis
Member

Hi @LucasRoesler thanks for sharing your thoughts here. This is something that Stefan had mentioned a few times in the past.

We also have some good documentation on how to configure HPAv2 (which you also mentioned). What I found in testing was that HPAv2 was more heavyweight, relying on several extra components for custom metrics usage. It also lagged, so it was not as responsive for bursting workloads; as I remember, it took up to 15 minutes to start scaling down.

My question is: have you seen any customer demand for this? Any comments or issues that you can reference?
