Gaps in metrics
#6322
Replies: 2 comments
-
Hey, thanks for reaching out. This can be a problem with both clustering and hashmod sharding, although it may be a little easier to see with clustering because it can scale dynamically. I've opened https://github.com/grafana/agent/issues/6333 to track a solution.
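To illustrate why a change in the number of agents can disturb scraping under hashmod sharding, here is a minimal sketch. It assumes each target is assigned to a shard by hashing its address and taking the result modulo the shard count, loosely modeled on the Prometheus `hashmod` relabel action. The names `shard_for` and `moved_targets` and the target addresses are hypothetical, and this is not the agent's clustering implementation.

```python
import hashlib


def shard_for(target: str, shards: int) -> int:
    """Assign a target to a shard: MD5 of the address, modulo the shard count.

    Assumption: this only approximates what a hashmod relabel rule does.
    """
    digest = hashlib.md5(target.encode("utf-8")).digest()
    # Use the last 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[8:], "big") % shards


def moved_targets(targets, old_shards, new_shards):
    """Return the targets whose owning shard differs between the two shard counts."""
    return [
        t for t in targets
        if shard_for(t, old_shards) != shard_for(t, new_shards)
    ]


# Hypothetical scrape targets, for illustration only.
targets = [f"10.0.0.{i}:9100" for i in range(1, 101)]
moved = moved_targets(targets, 3, 4)
print(f"{len(moved)} of {len(targets)} targets change owner going from 3 to 4 shards")
```

Every target that changes owner is dropped by one agent and picked up by another. If the two do not hand over in step, one or more scrape intervals can be missed, which would be consistent with gaps appearing when an agent joins or leaves.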
-
Thanks, @tpaschalis.
-
Hello,
I have a cluster of agents that scrapes a bunch of metrics via the Prometheus ServiceMonitor CRD.
I'm investigating the cause of gaps in some of my metrics. They seem to happen when a new agent starts and picks up the task of scraping these metrics, or when the agent that was scraping them dies.
Here is the metric that I investigated:
The gap is 5 minutes.
And here is a graph showing that a new agent was started at that time:
There are no errors in the logs.
So the question is: are there any configuration options or arguments that control how the agent cluster handles agents joining or leaving? Or are there any other suggestions for ensuring high availability of metrics?