Prometheus-operator stackdriver sidecar sharding events #233
Comments
I strongly suspect this is caused by particular data points that trigger an unrecoverable error which looks recoverable. Explaining it requires some kind of never-succeeding request, but the sidecar logic absolutely can fall into a permanent retry loop and block the WAL reader when this happens. This is documented in the downstream repository as lightstep/opentelemetry-prometheus-sidecar#88 and is partly mitigated in https://github.com/lightstep/opentelemetry-prometheus-sidecar/pulls/87
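To illustrate the failure mode described above, here is a minimal sketch in Python (not the sidecar's actual Go code) of a send loop that caps retries, so that a batch which always fails with a retryable-looking error is eventually dropped instead of blocking everything queued behind it. The names `RecoverableError`, `send_with_bounded_retry`, and `drain` are hypothetical and exist only for this example, and the retry limit and backoff values are assumptions.

```python
import time
from typing import Callable, Iterable, Sequence, Tuple


class RecoverableError(Exception):
    """Hypothetical marker for errors that look worth retrying (e.g. a timeout)."""


def send_with_bounded_retry(
    send: Callable[[Sequence[float]], None],
    batch: Sequence[float],
    max_attempts: int = 5,
    base_backoff: float = 0.0,
) -> bool:
    """Try to send one batch. Return True if delivered, False if dropped."""
    for attempt in range(max_attempts):
        try:
            send(batch)
            return True
        except RecoverableError:
            # Retry with exponential backoff, but only up to max_attempts.
            # An unbounded loop here is what would stall the reader forever.
            time.sleep(base_backoff * (2 ** attempt))
        except Exception:
            # Any other error is treated as permanent: drop immediately.
            return False
    return False


def drain(
    batches: Iterable[Sequence[float]],
    send: Callable[[Sequence[float]], None],
    max_attempts: int = 5,
) -> Tuple[int, int]:
    """Send batches in order and return (delivered, dropped) counts."""
    delivered, dropped = 0, 0
    for batch in batches:
        if send_with_bounded_retry(send, batch, max_attempts):
            delivered += 1
        else:
            dropped += 1
    return delivered, dropped
```

With a `send` function that always raises `RecoverableError` for one particular batch, `drain` gives up on that batch after `max_attempts` tries and still delivers the batches behind it. The trade-off is data loss for the dropped batch, which a real implementation would at least want to log and count.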
This is the function that never returns:
I see the same behaviour, with the same messages from the Stackdriver sidecar.
I am using ServiceMonitor k8s resources to add targets to Prometheus.
I keep receiving metrics in Stackdriver from the sidecar until I add a ServiceMonitor to my k8s cluster that adds 220 targets to my Prometheus. Once the targets come up, ALL metrics in Stackdriver stop at the same time and no new metric values appear. According to the sidecar container logs, a shard calculation takes place:
This keeps going for hours and hours, but the metrics do not return to Stackdriver.
Could you please help me understand the sharding?
Additionally, how could I speed up the process?
Thanks