diff --git a/_posts/2024-11-26-in-kubernetes-we-trust.adoc b/_posts/2024-11-26-in-kubernetes-we-trust.adoc
index ecab3c9..1fc8b13 100644
--- a/_posts/2024-11-26-in-kubernetes-we-trust.adoc
+++ b/_posts/2024-11-26-in-kubernetes-we-trust.adoc
@@ -77,7 +77,7 @@ Although autoscaling is a powerful Kubernetes feature, you cannot always fall ba

 === CPU Limits

-Setting CPU Limits in general is a contended topic for production workloads, since If you apply them the workloads are throttled by definition. Limits for CPU for soft-tenancy pods are probably not going to be helpful unless you are approaching very dense setups (> 10 pods per core) - otherwise, you will waste more CPU throttling than you save. CPU Limits definitely increase tail latencies for most non-predictable workloads (almost all request-driven use cases) in a way that will result in a worse overall application environment for most users most of the time (because of how limits are sliced). At lower pods per core, you are almost certainly trading a false security for a worse quality of service for the workloads you are running on Kubernetes.
+Setting CPU Limits in general is a contended topic for production workloads, since If you apply them the workloads are throttled by definition. Limits for CPU for soft-tenancy pods are probably not going to be helpful unless you are approaching very dense setups (> 10 pods per core) - otherwise, you will waste more CPU by throttling than you save. CPU Limits definitely increase tail latencies for most non-predictable workloads (almost all request-driven use cases) in a way that will result in a worse overall application environment for most users most of the time (because of how limits are sliced). At lower pods per core, you are almost certainly trading a false security for a worse quality of service for the workloads you are running on Kubernetes.

 CPU Limits are most useful when dealing with bad actors on your own platform, and even then, there are far more effective ways of dealing with bad actors like detection and account blocking. However, in the case of CDEs, you may consider applying the limits on the namespace level to prevent developers from accidentally saturating a compute node. If you apply limits, you must make sure the limits are high enough to allow normal bursts of CPU usage during the inner-loop activities. Otherwise, developers may experience unexpected performance issues during CPU-intensive activities.
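
To make the tail-latency claim in the paragraph above concrete, here is a rough sketch of how a CPU limit stretches a short burst. It is a simplified model, not a description of the real scheduler. It assumes the default 100 ms CFS enforcement period, a single-threaded burst that starts at the beginning of a period, and no other threads in the same cgroup consuming the quota. The function name `wall_time_ms` and its parameters are hypothetical, chosen only for illustration.

```python
# Simplified, illustrative model of CFS-style CPU throttling.
# Assumptions (not from the post): default 100 ms period, one thread,
# burst begins exactly at the start of a period.

PERIOD_MS = 100  # default CFS period (cpu.cfs_period_us = 100000)


def wall_time_ms(cpu_needed_ms: int, limit_millicores: int) -> int:
    """Wall-clock milliseconds needed to obtain cpu_needed_ms of CPU time."""
    if limit_millicores <= 0:
        raise ValueError("limit_millicores must be positive")
    if cpu_needed_ms <= 0:
        return 0
    # CPU time granted per period; one thread can use at most one full core.
    quota_ms = min(limit_millicores * PERIOD_MS // 1000, PERIOD_MS)
    if quota_ms == 0:
        raise ValueError("limit too small for this coarse model")
    full_periods, remainder = divmod(cpu_needed_ms, quota_ms)
    if remainder == 0:
        # The burst finishes the moment the last quota slice is used up.
        return (full_periods - 1) * PERIOD_MS + quota_ms
    # Each exhausted quota forces a wait until the next period begins.
    return full_periods * PERIOD_MS + remainder


if __name__ == "__main__":
    for limit in (2000, 500, 200):
        print(f"limit={limit}m -> {wall_time_ms(150, limit)} ms wall time")
```

In this model, a burst that needs 150 ms of CPU completes in 150 ms when the limit is at least one core, but takes 250 ms at 500m and 710 ms at 200m, because the task sits idle for the rest of each period once its quota is spent. This is the reason a namespace-level limit for CDEs should leave headroom for inner-loop bursts.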