Is your feature request related to a problem? Please describe.
Currently, running the manifest generated by Helm through a tool like kube-score results in the following error:
apps/v1/Deployment vault-agent-injector in my-namespace 💥
[CRITICAL] Pod Probes
· Container has the same readiness and liveness probe
Using the same probe for liveness and readiness is very likely
dangerous. Generally it's better to avoid the livenessProbe than
re-using the readinessProbe.
More information: https://github.com/zegl/kube-score/blob/master/README_PROBES.md
Given that the vault-agent-injector is already running as PID 1, a better option for the liveness check would be to rely on the default behaviour of k8s: restart the container if PID 1 has exited.
Describe the solution you'd like
Ability to configure the readiness and liveness probes. We could do this in a similar way to how it is supported for the Vault server:
# Used to define custom readinessProbe settings
readinessProbe:
  enabled: true
  # If you need to use a http path instead of the default exec
  path: /v1/sys/health?standbyok=true
  # When a probe fails, Kubernetes will try failureThreshold times before giving up
  failureThreshold: 2
  # Number of seconds after the container has started before probe initiates
  initialDelaySeconds: 5
  # How often (in seconds) to perform the probe
  periodSeconds: 5
  # Minimum consecutive successes for the probe to be considered successful after having failed
  successThreshold: 1
  # Number of seconds after which the probe times out.
  timeoutSeconds: 3
# Used to enable a livenessProbe for the pods
livenessProbe:
  enabled: false
  path: "/v1/sys/health?standbyok=true"
  # When a probe fails, Kubernetes will try failureThreshold times before giving up
  failureThreshold: 2
  # Number of seconds after the container has started before probe initiates
  initialDelaySeconds: 60
  # How often (in seconds) to perform the probe
  periodSeconds: 5
  # Minimum consecutive successes for the probe to be considered successful after having failed
  successThreshold: 1
  # Number of seconds after which the probe times out.
  timeoutSeconds: 3
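To make the request concrete, an equivalent block for the injector might look roughly like the sketch below. The `injector.readinessProbe` and `injector.livenessProbe` keys and the timing values are assumptions modelled on the server settings above, not options the chart is known to expose.

```yaml
# Hypothetical values.yaml sketch: these injector keys are an assumption
# modelled on the server block, not a confirmed chart option.
injector:
  readinessProbe:
    enabled: true
    # When a probe fails, Kubernetes will try failureThreshold times before giving up
    failureThreshold: 2
    # Number of seconds after the container has started before probe initiates
    initialDelaySeconds: 5
    # How often (in seconds) to perform the probe
    periodSeconds: 5
    # Minimum consecutive successes for the probe to be considered successful after having failed
    successThreshold: 1
    # Number of seconds after which the probe times out
    timeoutSeconds: 3
  livenessProbe:
    # Disabled: Kubernetes still restarts the container when PID 1 exits
    enabled: false
```

With the liveness probe disabled, the rendered Deployment would carry only a readiness probe, which should also clear the kube-score "same readiness and liveness probe" finding.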
No additional context, but I can provide more if required. Also, liveness and readiness probes can be similar in some situations, and there isn't a strict need to follow kube-score's recommendations. But if that's the case here, I would like to understand the reasoning behind it.
I would recommend not setting an HTTP-based readiness/liveness check for Vault Agents. I think this endpoint currently just forwards to the Vault Server, so if the Vault Server goes down, the Vault Agents will crash.
The Vault Agents should be resilient to failures from the Vault Server so that operations continue despite Vault Server failures.
Since the Vault Agent doesn't serve HTTP traffic, liveness/readiness checks aren't really important. A potentially more appropriate solution is to have the agent exit if template rendering fails too many times: vault.hashicorp.com/template-config-exit-on-retry-failure.
You should be able to identify failures with this, and if the problem goes unfixed the entire pod should fail eventually.
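As a rough illustration, the annotation would sit on the application's pod template alongside the usual injector annotations. The role name `my-app` is a placeholder, and the fragment is a sketch rather than a complete manifest.

```yaml
# Pod template fragment for an application that uses the injector.
# "my-app" is a placeholder role name, not taken from this thread.
spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "my-app"
        # Make the agent exit once template retries are exhausted,
        # so the failure surfaces as a container exit
        vault.hashicorp.com/template-config-exit-on-retry-failure: "true"
```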
If you want to run the Vault Agent as a central HTTP caching proxy (which is really quite useful, but outside the scope of the injector), the pod is really "ready" as soon as it comes online. Vault requests will just occasionally fail.
I think the Vault Agent HTTP Caching Proxy is potentially so simple that you just monitor/manage either end of the transaction and if either end has a problem you figure it out.
👉 Forgot about TCP readiness checks! For the HTTP proxy, it's good to just check whether the port is open. I'm still not sure there's a great check for a Vault Agent configured to render templates.
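A TCP check of that kind could be sketched as follows for an agent running as a caching proxy. The port `8200` and the timing values are assumptions and should match whatever listener the agent is actually configured with.

```yaml
# Container-level readiness probe that only checks that the port accepts connections.
readinessProbe:
  tcpSocket:
    # Assumed listener port; use the port from the agent's listener config
    port: 8200
  initialDelaySeconds: 5
  periodSeconds: 5
```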
How would you handle the case where the vault-container is stopped/crashed/removed and the secret is also removed from the application container?
A probe on the application side will only restart the application container but not the whole deployment.
Describe alternatives you've considered
There are various recommendations from kube-score itself regarding the alternative solutions for the liveness probes in general: https://github.com/zegl/kube-score/blob/master/README_PROBES.md#livenessprobe