VSO controller_resource_status metrics not updating #932

Open
luizrojo opened this issue Sep 24, 2024 · 0 comments
Labels
bug Something isn't working

Describe the bug

The controller_resource_status metrics, which report the auth and connection status between the VSO and a Vault instance, do not get updated.

Metrics for controller="vaultconnection" and controller="vaultauth" experience this issue.

From the observed behavior, the metric is only updated when going from a failure state to a working state. The opposite transition, from working to failing, is never reflected.

To Reproduce

Steps to reproduce the behavior:

  1. Deploy the Vault Secrets Operator to the Kubernetes cluster with the required connectivity configuration in place, such as network policies;
  2. Wait for the VSO to start, run the metric-related checks, and start exposing the metrics;
  3. Validate that the controller_resource_status metrics show a value of 1, meaning that the VSO was able to connect to Vault;
  4. Drop the network policies (or network connectivity) to Vault;
  5. Watch the metrics and see that they are not updated to a value of 0, which would indicate that the connection is failing.
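
For anyone who wants to check steps 3 and 5 without a dashboard, here is a minimal sketch that pulls the relevant samples out of a Prometheus text-format scrape. It is only an illustration: the `name` and `namespace` labels, the `othercontroller` value, and the sample payload are assumptions made up for the example. Only the metric name and the `controller="vaultconnection"` / `controller="vaultauth"` labels come from this report.

```python
import re

# One sample line of the Prometheus text exposition format, e.g.
#   metric_name{label="value",...} 1
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def resource_status(metrics_text, controllers=("vaultconnection", "vaultauth")):
    """Return a list of (labels, value) for controller_resource_status samples
    whose `controller` label is one of `controllers`."""
    samples = []
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        m = SAMPLE_RE.match(line)
        if not m or m.group("name") != "controller_resource_status":
            continue
        labels = dict(LABEL_RE.findall(m.group("labels") or ""))
        if labels.get("controller") in controllers:
            samples.append((labels, float(m.group("value"))))
    return samples


def failing(samples):
    """Return only the samples that report a failure (value 0)."""
    return [(labels, value) for labels, value in samples if value == 0]


# Illustrative payload only; label names other than `controller` are assumed.
SAMPLE = """\
# TYPE controller_resource_status gauge
controller_resource_status{controller="vaultconnection",name="default",namespace="vso"} 1
controller_resource_status{controller="vaultauth",name="default",namespace="vso"} 1
controller_resource_status{controller="othercontroller",name="app",namespace="apps"} 1
"""

if __name__ == "__main__":
    for labels, value in resource_status(SAMPLE):
        print(labels["controller"], value)
```

Running this against a scrape taken after step 4 should, if the metric behaved as expected, make `failing(...)` return the vaultconnection and vaultauth samples. With the behavior described here it stays empty. How the scrape text is obtained (port-forward, service URL, and so on) depends on the deployment and is left out on purpose.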

Here are screenshots of a dashboard that visualize the behavior:

Screenshot 2024-09-24 at 11 17 47

Screenshot 2024-09-24 at 11 27 01

Application deployment:

There is no application involved in this case, since this is a VSO issue.

Expected behavior

When the connectivity to Vault becomes unavailable, the metrics should be updated to show the actual status.
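
To make the expectation concrete, here is a small sketch of the pattern I would expect: the gauge is written on every check, in both the success and the failure branch. This is not the VSO's actual code (the operator is written in Go and I have not traced its implementation); `Gauge`, `record_status`, and the check callables are hypothetical names used only for illustration.

```python
class Gauge:
    """Minimal stand-in for a metrics gauge, keyed by a tuple of label values."""

    def __init__(self):
        self.values = {}

    def set(self, labels, value):
        self.values[labels] = value


def record_status(gauge, controller, name, check):
    """Run a connectivity check and always write the outcome to the gauge.

    `check` is any callable that raises on failure (hypothetical interface).
    """
    try:
        check()
        ok = True
    except Exception:
        ok = False
    # Written in both branches, so a working -> failing transition shows up as 0.
    gauge.set((controller, name), 1 if ok else 0)
    return ok


def unreachable():
    # Simulates Vault being unreachable after the network policies are dropped.
    raise ConnectionError("vault unreachable")


if __name__ == "__main__":
    gauge = Gauge()
    record_status(gauge, "vaultconnection", "default", lambda: None)
    record_status(gauge, "vaultconnection", "default", unreachable)
    print(gauge.values)
```

The point of the sketch is only that the failure path sets the value to 0 instead of leaving the previous 1 in place.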

Environment

  • Kubernetes version:
    • Distribution or cloud vendor (OpenShift, EKS, GKE, AKS, etc.): RKE v1.29.6+rke2r1
    • Other configuration options or runtime services (istio, etc.): Cilium
  • vault-secrets-operator version: 0.7.1

Additional context

We are moving to the latest version available at the moment (0.8.1), but the changelog contains no references to metrics or observability that would indicate this has been fixed or improved.

@luizrojo luizrojo added the bug Something isn't working label Sep 24, 2024