-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: throw away samples on Prometheus Remote Write 4xx errors #15
Conversation
samples, _, _ = metricsLog.GetSamples() | ||
if len(samples) != 0 { | ||
t.Fatalf("expected all samples to be pruned") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you like to add a test for RecoverableError
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test case added.
This change adds concept of Recoverable and Non-Recoverable error in Notifier interface. Deamon throws away the samples on non-recoverable error. PrometheusNotifier was modified to produce Non-recoverable error on 4xx http error family and on unknown http errors. This should be in complience with prometheus remote write spec as the client should not retry on 4xx except on 429. This also means that host-metering will throw away samples from time when - it was misconfigured with wrong URL (404) - auth issue happened (403) It doesn't throw away samples on connectivity issues (no response from server). Signed-off-by: Petr Vobornik <[email protected]>
5d0643e
to
342a8ad
Compare
Merged, thanks @pvoborni ! |
# 1.0.0 (2023-11-07) ### Bug Fixes * add external_organization label ([92489c2](92489c2)) * change prometheus remote write url ([f2131db](f2131db)) * Count the number of vCPUs instead of the number of CPU cores ([7af0f82](7af0f82)) * crash on failed cert watcher creation ([b514d48](b514d48)) * crash when host cert is not available ([e48c1de](e48c1de)) * do not try to notify if required data are missing ([#14](#14)) ([8620adf](8620adf)) * don't print success on http errors ([1e3d917](1e3d917)) * Don't special-case first index in the metrics log ([4311117](4311117)) * double call of `subscription-manager identity` ([#17](#17)) ([e578ee9](e578ee9)) * dropping of expired samples ([#19](#19)) ([f2d17ed](f2d17ed)) * end of line from CRLF to LF of some files ([b552e01](b552e01)) * Fix the billing info generated for the GCP marketplace ([9e023b0](9e023b0)) * follow proxy settings defined in env vars ([83db7ba](83db7ba)) * HostInfo not loaded in deamon mode ([e40c017](e40c017)) * Introduce the INIConfig type ([15453bf](15453bf)) * Introduce the MultiError structure ([40548cd](40548cd)) * Log the configuration messages by default ([6412a96](6412a96)) * logging improvements for easier debugging ([#18](#18)) ([286b1e6](286b1e6)) * Move billing info to a new structure ([8e24825](8e24825)) * move start of timers after initial notify ([0697014](0697014)) * Prepare the metrics log for the introduction of checkpoints ([e94166b](e94166b)) * prevent overlapping remote writes ([0de885a](0de885a)) * Print debug messages by default without further configuration ([89fde19](89fde19)) * Rename the cpu cache to the metrics log ([1d7bbf0](1d7bbf0)) * return error when out of retries ([37d8670](37d8670)) * selinux denials - dbus send msgs, squid port connect ([d440e73](d440e73)) * Simplify loading of the host info from facts ([32ce8d5](32ce8d5)) * stop/restart on rpm uninstall/update, remove selinux policy on rpm uninstall ([#21](#21)) ([7e025d1](7e025d1)) * The `log_path` configuration attribute should set the log path ([46c20e7](46c20e7)) * throw away samples on Prometheus Remote Write 4xx errors ([#15](#15)) ([f92a91b](f92a91b)) * tune http Transport config ([bfb5b87](bfb5b87)) * Unify parsing of uint configuration values ([b511bc4](b511bc4)) * Unify the configuration of the host certificates ([05be176](05be176)) * Unify the execution of subscription-manager ([fff1554](fff1554)) * Unify the processing of the subscription-manager outputs ([9901c6d](9901c6d)) * Use the subscription manager to get the host id ([52011f3](52011f3)) ### Features * add systemd service unit file ([d1123a0](d1123a0)) * add unit suffix to interval configuration variables ([63de6c8](63de6c8)) * cache for CPU timeseries ([2219dca](2219dca)) * client daemon PoC ([7e4def3](7e4def3)) * collect metrics/labels and notify right after daemon start ([fd50d7c](fd50d7c)) * **config:** loading from config file ([061de33](061de33)) * **config:** loading from environmetal variables ([0cc9d32](0cc9d32)) * configurable Prometheus remote write timeout ([5fd555e](5fd555e)) * Ensure exclusive access to the metrics log ([6cdd6f4](6cdd6f4)) * filter out samples older than maxAge ([743a349](743a349)) * **HostInfo:** load information via subscription-manager ([88ce8f4](88ce8f4)) * make CPU cache path configurable ([4dc8823](4dc8823)) * metrics_max_age_sec config value ([4cd9102](4cd9102)) * monitor and react to subscription changes ([82473d3](82473d3)) * **notify:** recreate http client on host info change ([124c7f4](124c7f4)) * print Host Info after it is reloaded ([30338e9](30338e9)) * **prometheus:** incremental back-off ([1c3c2b3](1c3c2b3)) * remove CLI options ([fa8a0a7](fa8a0a7)) * reuse http client for notification ([eb49b5e](eb49b5e)) * selinux policy ([e460539](e460539)) * send conversions_success label when host was converted ([#16](#16)) ([fe8b644](fe8b644)) * stop and re-run deamon ([09047be](09047be)) * use cpu cache ([eef9d25](eef9d25)) * Use std logging library ([3ce70eb](3ce70eb)) * **write:** include extended HostInfo as labels ([74a9505](74a9505))
🎉 This PR is included in version 1.0.0 🎉 The release is available on:
Your semantic-release bot 📦🚀 |
# 1.0.0 (2023-11-29) ### Bug Fixes * add external_organization label ([92489c2](92489c2)) * change prometheus remote write url ([f2131db](f2131db)) * Count the number of vCPUs instead of the number of CPU cores ([7af0f82](7af0f82)) * crash on failed cert watcher creation ([b514d48](b514d48)) * crash when host cert is not available ([e48c1de](e48c1de)) * do not try to notify if required data are missing ([RedHatInsights#14](https://github.com/pvoborni/host-metering/issues/14)) ([8620adf](8620adf)) * don't print success on http errors ([1e3d917](1e3d917)) * Don't special-case first index in the metrics log ([4311117](4311117)) * double call of `subscription-manager identity` ([RedHatInsights#17](https://github.com/pvoborni/host-metering/issues/17)) ([e578ee9](e578ee9)) * dropping of expired samples ([RedHatInsights#19](https://github.com/pvoborni/host-metering/issues/19)) ([f2d17ed](f2d17ed)) * end of line from CRLF to LF of some files ([b552e01](b552e01)) * Fix the billing info generated for the GCP marketplace ([9e023b0](9e023b0)) * follow proxy settings defined in env vars ([83db7ba](83db7ba)) * HostInfo not loaded in deamon mode ([e40c017](e40c017)) * Introduce the INIConfig type ([15453bf](15453bf)) * Introduce the MultiError structure ([40548cd](40548cd)) * Log the configuration messages by default ([6412a96](6412a96)) * logging improvements for easier debugging ([RedHatInsights#18](https://github.com/pvoborni/host-metering/issues/18)) ([286b1e6](286b1e6)) * Move billing info to a new structure ([8e24825](8e24825)) * move start of timers after initial notify ([0697014](0697014)) * Prepare the metrics log for the introduction of checkpoints ([e94166b](e94166b)) * prevent overlapping remote writes ([0de885a](0de885a)) * Print debug messages by default without further configuration ([89fde19](89fde19)) * Rename the cpu cache to the metrics log ([1d7bbf0](1d7bbf0)) * return error when out of retries ([37d8670](37d8670)) * selinux denials - dbus send msgs, squid port connect ([d440e73](d440e73)) * Simplify loading of the host info from facts ([32ce8d5](32ce8d5)) * stop/restart on rpm uninstall/update, remove selinux policy on rpm uninstall ([RedHatInsights#21](https://github.com/pvoborni/host-metering/issues/21)) ([7e025d1](7e025d1)) * The `log_path` configuration attribute should set the log path ([46c20e7](46c20e7)) * throw away samples on Prometheus Remote Write 4xx errors ([RedHatInsights#15](https://github.com/pvoborni/host-metering/issues/15)) ([f92a91b](f92a91b)) * tune http Transport config ([bfb5b87](bfb5b87)) * Unify parsing of uint configuration values ([b511bc4](b511bc4)) * Unify the configuration of the host certificates ([05be176](05be176)) * Unify the execution of subscription-manager ([fff1554](fff1554)) * Unify the processing of the subscription-manager outputs ([9901c6d](9901c6d)) * Use the subscription manager to get the host id ([52011f3](52011f3)) ### Features * add label refresh on label_refresh_interval ([6cf0df3](6cf0df3)) * add systemd service unit file ([d1123a0](d1123a0)) * add unit suffix to interval configuration variables ([63de6c8](63de6c8)) * cache for CPU timeseries ([2219dca](2219dca)) * capture and send installed products ID ([RedHatInsights#25](https://github.com/pvoborni/host-metering/issues/25)) ([5f5e552](5f5e552)) * client daemon PoC ([7e4def3](7e4def3)) * collect metrics/labels and notify right after daemon start ([fd50d7c](fd50d7c)) * **config:** loading from config file ([061de33](061de33)) * **config:** loading from environmetal variables ([0cc9d32](0cc9d32)) * configurable log prefix ([fcb8555](fcb8555)) * configurable Prometheus remote write timeout ([5fd555e](5fd555e)) * document go proxy envs in man page ([773a12f](773a12f)) * Ensure exclusive access to the metrics log ([6cdd6f4](6cdd6f4)) * filter out samples older than maxAge ([743a349](743a349)) * **HostInfo:** load information via subscription-manager ([88ce8f4](88ce8f4)) * make CPU cache path configurable ([4dc8823](4dc8823)) * metrics_max_age_sec config value ([4cd9102](4cd9102)) * monitor and react to subscription changes ([82473d3](82473d3)) * **notify:** recreate http client on host info change ([124c7f4](124c7f4)) * print Host Info after it is reloaded ([30338e9](30338e9)) * **prometheus:** incremental back-off ([1c3c2b3](1c3c2b3)) * remove CLI options ([fa8a0a7](fa8a0a7)) * reuse http client for notification ([eb49b5e](eb49b5e)) * selinux policy ([e460539](e460539)) * send conversions_success label when host was converted ([RedHatInsights#16](https://github.com/pvoborni/host-metering/issues/16)) ([fe8b644](fe8b644)) * shorten go proxy env description ([728e497](728e497)) * stop and re-run deamon ([09047be](09047be)) * use cpu cache ([eef9d25](eef9d25)) * Use std logging library ([3ce70eb](3ce70eb)) * **write:** include extended HostInfo as labels ([74a9505](74a9505))
This change adds concept of Recoverable and Non-Recoverable error in Notifier interface. Deamon throws away the samples on non-recoverable error.
PrometheusNotifier was modified to produce Non-recoverable error on 4xx http error family and on unknown http errors. This should be in complience with prometheus remote write spec as the client should not retry on 4xx except on 429.
This also means that host-metering will throw away samples from time when
It doesn't throw away samples on connectivity issues (no response from server).