
Proposal: First class support for HPA #784

Closed

LucasRoesler opened this issue Apr 9, 2021 · 1 comment

Comments

@LucasRoesler
Member

A frequent question in Slack is about alternative ways to control scaling and/or instance placement in a cluster. We already have a tutorial about how to replace the default request-rate auto-scaling with metrics-based scaling via HPA in Kubernetes. See openfaas/docs#164, https://docs.openfaas.com/tutorials/kubernetes-hpa/, and https://docs.openfaas.com/tutorials/kubernetes-hpa-custom-metrics/

Overview

It seems that we could provide a cleaner workflow for this in faas-netes by using annotations. I think an ideal workflow would look something like this:

  1. add a new flag/env variable "hpa-scaling-enabled" to the faas-netes server
  2. add a configuration to the Helm chart to enable this
    • we could even add a pre-install hook to check for the metrics-server to ensure the cluster is ready
  3. During a function deployment, we check the function annotations and create a corresponding HPA object targeting the function. This means we create three k8s objects: a Service, a Deployment, and a HorizontalPodAutoscaler (see the Go sketch after this list).
    • during deletes, we of course delete all three objects
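As a minimal sketch of step 3, assuming a hypothetical deployHPA helper in the faas-netes handlers (the names and wiring here are illustrative, not existing code), creating the HPA with client-go's autoscaling/v2beta2 client could look something like this:

package handlers // hypothetical placement within faas-netes

import (
	"context"

	autoscalingv2beta2 "k8s.io/api/autoscaling/v2beta2"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// deployHPA creates a HorizontalPodAutoscaler targeting the function's
// Deployment, alongside the Service and Deployment we already create.
func deployHPA(ctx context.Context, clientset kubernetes.Interface, namespace, functionName string, minReplicas, maxReplicas int32) error {
	hpa := &autoscalingv2beta2.HorizontalPodAutoscaler{
		ObjectMeta: metav1.ObjectMeta{
			Name:      functionName,
			Namespace: namespace,
		},
		Spec: autoscalingv2beta2.HorizontalPodAutoscalerSpec{
			ScaleTargetRef: autoscalingv2beta2.CrossVersionObjectReference{
				APIVersion: "apps/v1",
				Kind:       "Deployment",
				Name:       functionName,
			},
			MinReplicas: &minReplicas,
			MaxReplicas: maxReplicas,
			// Metrics would be built from the function's annotations,
			// see the annotation mapping discussed below.
		},
	}

	_, err := clientset.AutoscalingV2beta2().
		HorizontalPodAutoscalers(namespace).
		Create(ctx, hpa, metav1.CreateOptions{})
	return err
}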

Doing this should significantly simplify these more advanced use-cases. This could be especially useful for async workflows that should be scaled on metrics like queue depth or cpu/memory usage instead of requests per second.

Details

The HPA walkthrough in the k8s docs gives an advanced example that looks like this:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1beta1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 10k

The simplest implementation would support a single metric for scaling and map 1:1 to the fields, for example

- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 50

would become these annotations

com.openfaas.scale.min: 1
com.openfaas.scale.max: 10
com.openfaas.scale.type: resource
com.openfaas.scale.metric: cpu
com.openfaas.scale.targetType: utilization
com.openfaas.scale.targetValue: 50
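As a hedged sketch of that 1:1 mapping (the helper name metricFromAnnotations is hypothetical, and only the Resource/utilization case from the example is handled), translating those flat annotations back into an autoscaling/v2beta2 MetricSpec could look like:

import (
	"strconv"

	autoscalingv2beta2 "k8s.io/api/autoscaling/v2beta2"
	corev1 "k8s.io/api/core/v1"
)

// metricFromAnnotations maps the flat com.openfaas.scale.* annotations
// above to a single HPA metric. Error handling is omitted for brevity.
func metricFromAnnotations(annotations map[string]string) autoscalingv2beta2.MetricSpec {
	target, _ := strconv.ParseInt(annotations["com.openfaas.scale.targetValue"], 10, 32)
	utilization := int32(target)

	return autoscalingv2beta2.MetricSpec{
		Type: autoscalingv2beta2.ResourceMetricSourceType,
		Resource: &autoscalingv2beta2.ResourceMetricSource{
			Name: corev1.ResourceName(annotations["com.openfaas.scale.metric"]),
			Target: autoscalingv2beta2.MetricTarget{
				Type:               autoscalingv2beta2.UtilizationMetricType,
				AverageUtilization: &utilization,
			},
		},
	}
}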

Alternatively, to support multiple metrics we could use a key=value format

prometheus.io.scrape: true
prometheus.io.port: 8081
com.openfaas.scale.min: 1
com.openfaas.scale.max: 10
com.openfaas.scale.metric.0: type=Resource,metric=cpu,targetType=utilization,targetValue=50
com.openfaas.scale.metric.1: type=Pods,metric=packets-per-second,targetType=AverageValue,targetValue=1k

A single rule could look like:

com.openfaas.scale.min: 1
com.openfaas.scale.max: 10
com.openfaas.scale.metric: type=Resource,metric=cpu,targetType=utilization,targetValue=50

Supporting one or multiple metrics is relatively easy by sorting and filtering the annotations on the string com.openfaas.scale.metric, as in the sketch below.
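A rough sketch of that sort-and-filter step (the helper name parseScalingMetrics is hypothetical, and a real implementation would validate the parsed fields):

import (
	"sort"
	"strings"
)

// parseScalingMetrics collects every annotation key that starts with
// "com.openfaas.scale.metric" in a stable order, then splits each value
// on the proposed key=value,key=value format.
func parseScalingMetrics(annotations map[string]string) []map[string]string {
	var keys []string
	for key := range annotations {
		// Matches both "com.openfaas.scale.metric" (single rule) and
		// "com.openfaas.scale.metric.0", ".1", ... (multiple rules).
		if strings.HasPrefix(key, "com.openfaas.scale.metric") {
			keys = append(keys, key)
		}
	}
	sort.Strings(keys)

	var rules []map[string]string
	for _, key := range keys {
		rule := map[string]string{}
		for _, pair := range strings.Split(annotations[key], ",") {
			if parts := strings.SplitN(pair, "=", 2); len(parts) == 2 {
				rule[parts[0]] = parts[1]
			}
		}
		rules = append(rules, rule)
	}
	return rules
}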

The serialization and key names are of course up for debate.

A note about validation: I think we can roll this out iteratively, starting with built-in metrics and then expanding to custom metrics. When we add support for custom metrics, we should add validation for, or automatically inject, the Prometheus scrape annotations that are required to ensure the metrics exist.

A note about sidecars: In k8s 1.20+ we can also create container resource metrics using ContainerResource, which lets us target just the function container. This is useful in clusters that have sidecar injection, for example clusters using a service mesh or some of the tracing deployments. We should sniff the cluster version and prefer this object over the more generic Resource.
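For illustration, a ContainerResource variant of the CPU metric might look like this in Go (the container name "function" is an assumption made for the sake of the example):

import (
	autoscalingv2beta2 "k8s.io/api/autoscaling/v2beta2"
	corev1 "k8s.io/api/core/v1"
)

// containerResourceCPUMetric scopes the CPU metric to the function's own
// container, ignoring injected sidecars (requires k8s 1.20+).
func containerResourceCPUMetric(averageUtilization int32) autoscalingv2beta2.MetricSpec {
	return autoscalingv2beta2.MetricSpec{
		Type: autoscalingv2beta2.ContainerResourceMetricSourceType,
		ContainerResource: &autoscalingv2beta2.ContainerResourceMetricSource{
			Name:      corev1.ResourceCPU,
			Container: "function", // assumed container name
			Target: autoscalingv2beta2.MetricTarget{
				Type:               autoscalingv2beta2.UtilizationMetricType,
				AverageUtilization: &averageUtilization,
			},
		},
	}
}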

@alexellis
Member

Hi @LucasRoesler thanks for sharing your thoughts here. This is something that Stefan had mentioned a few times in the past.

We also have some good documentation on how to configure HPAv2 (which you also mentioned). What I found in testing was that HPAv2 was more heavyweight, relying on several extra components for custom metrics usage. It also lagged, so it was not as responsive for bursting workloads; as I remember, it took up to 15 minutes to start scaling down.

My question is: have you seen any customer demand for this? Any comments or issues that you can reference?
