add metrics for providers #273

Open
pathcl opened this issue Jun 29, 2024 · 1 comment

pathcl commented Jun 29, 2024

We'd like to understand more about runners and providers.

We have metrics for the GH API calls, but no metrics for provider calls. We currently can't see whether a runner failed to reach the idle state and is being recreated over and over because of the bootstrap timeout.

Let's try to add metrics for provider calls.
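
One way to picture the request: wrap every provider call so that it bumps an operations counter and, on failure, an errors counter, both labelled by operation and provider. The sketch below uses only the Python standard library. `ProviderMetrics`, `call_provider`, and the `CreateInstance` operation name are hypothetical names for illustration, not garm's code or the Prometheus client API.

```python
from collections import Counter
from typing import Callable, TypeVar

T = TypeVar("T")


class ProviderMetrics:
    """In-memory stand-in for two labelled counters (hypothetical, not garm's API)."""

    def __init__(self) -> None:
        self.operations: Counter = Counter()
        self.errors: Counter = Counter()

    def call_provider(self, provider: str, operation: str, fn: Callable[[], T]) -> T:
        """Run fn, counting the attempt and any failure under (operation, provider)."""
        key = (operation, provider)
        self.operations[key] += 1
        try:
            return fn()
        except Exception:
            self.errors[key] += 1
            raise  # the caller still sees the original failure


def failing_create() -> None:
    # Simulated provider failure, e.g. a runner that never becomes idle.
    raise TimeoutError("bootstrap timeout")


metrics = ProviderMetrics()
metrics.call_provider("lxd", "CreateInstance", lambda: "ok")
for _ in range(2):
    try:
        metrics.call_provider("lxd", "CreateInstance", failing_create)
    except TimeoutError:
        pass
```

With counters like these, a runner that is recreated repeatedly would show up as an errors count that keeps growing for the same operation and provider.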

@gabriel-samfira gabriel-samfira added this to the v0.1.6 milestone Jul 1, 2024

bavarianbidi commented Jul 22, 2024

Hi @pathcl, with #217 I've also introduced metrics for the runner package (documentation: https://github.com/cloudbase/garm/blob/main/doc/config_metrics.md#runner-metrics).

We are already running a patched version of v0.1.4 into which we cherry-picked some of the changes we wanted on our side (#217 is among them). Feel free to build our patched garm version yourself and give it a try; all patches are already part of the main branch in garm itself.

Out of curiosity: do you want more metrics than these, or is this exactly what you are looking for?

PromQL query (error percentage per operation and provider over a 5-minute window):

    (
        sum by (operation, provider) (
          rate(
            garm_runner_errors_total{app_kubernetes_io_instance="garm-prod",app_kubernetes_io_name="garm"}[5m]
          )
        )
      or
        sum by (operation, provider) (
            garm_runner_operations_total{app_kubernetes_io_instance="garm-prod",app_kubernetes_io_name="garm"}
          *
            0
        )
    )
  /
    sum by (operation, provider) (
      rate(
        garm_runner_operations_total{app_kubernetes_io_instance="garm-prod",app_kubernetes_io_name="garm"}[5m]
      )
    )
*
  100
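
To make the arithmetic of the query explicit, here is a minimal Python sketch of what it appears to compute per `(operation, provider)` pair: the error rate divided by the operation rate, times 100, where the `or ... * 0` branch supplies a zero for pairs that have operations but no error series yet. The function name, the operation names, and the sample values are made up for illustration.

```python
import math
from typing import Dict, Tuple

Labels = Tuple[str, str]  # (operation, provider)


def error_percentage(
    error_rate: Dict[Labels, float],
    operation_rate: Dict[Labels, float],
) -> Dict[Labels, float]:
    """Mirror `(errors or operations * 0) / operations * 100` per label pair."""
    result: Dict[Labels, float] = {}
    for labels, ops in operation_rate.items():
        # `or ... * 0`: pairs without an error series fall back to 0.
        errs = error_rate.get(labels, 0.0)
        if ops == 0:
            # PromQL float division: 0/0 is NaN, x/0 is +Inf for x > 0.
            result[labels] = math.nan if errs == 0 else math.inf
        else:
            result[labels] = errs / ops * 100
    return result


percentages = error_percentage(
    error_rate={("CreateInstance", "lxd"): 0.5},
    operation_rate={
        ("CreateInstance", "lxd"): 2.0,
        ("DeleteInstance", "lxd"): 1.0,
    },
)
```

Without the fallback, a pair with no error series would simply be missing from the result rather than showing 0%, which is presumably why the query includes it.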
