Intermittent replication errors when running ipa-healthcheck #283
Comments
This check is provided by 389 itself. I suppose we could consider reducing the severity to WARNING, but I'd leave that as a call to them. @mreynolds389 what do you think?
Well, it is a transient error. Replication is just busy at that time. If you run it again in a few seconds it will probably pass. For us, we already set it to a "medium" severity.
Thanks both for replying! Yes, it's a transient error. We run ipahealthcheck_exporter, which basically scrapes ipa-healthcheck logs every 5 minutes. Can you suggest an alternative way of verifying replication health? @mreynolds389 you mentioned you set it to "medium" severity, could I ask how?
Well, IPA is using DS's lib389 library for the DS healthchecks. IPA does not use DS's healthcheck severity level; it is ignored because there are basically two tools that were merged.
@rcritten Since IPA does not use DS's healthcheck severity level, could this check's severity level be lowered to WARNING in IPA?
healthcheck doesn't ignore the DS severity. It converts it. See #283 (comment): "medium" from DS is converted into an ipa-healthcheck ERROR severity.
Thanks for clarifying. Do we want to set this specific check's severity to WARNING, bypassing the conversion? As mentioned, it is a transient error, but it is still triggering an ERROR severity.
I suppose it's possible but it would be an ugly one-off. healthcheck has a rather thin wrapper to call the 389 checks and then re-format the return value. It's very generic code. It would be invasive to put in a test for a specific check.
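To make the trade-off concrete, here is a rough sketch of what a generic severity conversion plus a per-check override could look like. This is not the actual healthcheck wrapper code: the function name, the override table, the check key, and the fallback for unknown severities are all hypothetical. Only the "medium" to ERROR row comes from this thread.

```python
# Sketch only: names and structure are assumptions, not ipa-healthcheck code.

# Generic conversion table. Only "medium" -> "ERROR" is confirmed in this
# thread; other DS severities are deliberately left out.
DS_TO_IPA_SEVERITY = {"medium": "ERROR"}

# Hypothetical per-check override table. The key below is made up and
# stands in for whatever identifies the "replica is busy" result.
SEVERITY_OVERRIDES = {"hypothetical-replica-busy": "WARNING"}


def convert_severity(ds_severity, check_key=None):
    """Map a DS severity to an ipa-healthcheck severity.

    The override lookup is the "one-off": the otherwise generic
    conversion has to know about one specific check.
    """
    if check_key in SEVERITY_OVERRIDES:
        return SEVERITY_OVERRIDES[check_key]
    # Falling back to ERROR for unknown values is an assumption.
    return DS_TO_IPA_SEVERITY.get(ds_severity.lower(), "ERROR")
```

The override branch is the part that would be invasive: every special case of this kind adds check-specific knowledge to code that is otherwise a thin, generic wrapper.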
I looked at the code and would assume as much and I tend to agree. Currently we exclude this specific check since we can't really "trust" the ERROR trigger.
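As an alternative to excluding the check outright, a monitoring-side option is to alert only when the same replication error shows up in several consecutive runs. The sketch below assumes each run's output has already been parsed from JSON into a list of dicts with `check`, `result`, and `kw` fields. The function name, the `min_consecutive` threshold, and the example key are hypothetical.

```python
from collections import defaultdict


def persistent_errors(runs, check="ReplicationCheck", min_consecutive=3):
    """Return keys of errors seen in the last `min_consecutive` runs in a row.

    `runs` is a list of parsed healthcheck results, oldest first. Each
    run is a list of dicts; the field names used here are assumptions
    about the JSON output shape.
    """
    streak = defaultdict(int)
    for results in runs:
        failing = {
            r.get("kw", {}).get("key", r.get("check"))
            for r in results
            if r.get("check") == check and r.get("result") == "ERROR"
        }
        # An error that is absent from this run loses its streak.
        for key in list(streak):
            if key not in failing:
                streak[key] = 0
        for key in failing:
            streak[key] += 1
    return sorted(k for k, n in streak.items() if n >= min_consecutive)


# Usage with made-up data: one busy blip between two clean runs.
busy = [{"check": "ReplicationCheck", "result": "ERROR",
         "kw": {"key": "hypothetical-replica-busy"}}]
print(persistent_errors([[], busy, []]))  # transient, so nothing is reported
```

With a 5-minute scrape interval and `min_consecutive=3`, an error would have to persist for roughly 10 to 15 minutes before it is reported, which filters out a replica that is only briefly busy at the cost of slower detection of real failures.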
Issue
Intermittent replication errors when running ipa-healthcheck.
Running ipa-healthcheck every x minutes gives unreliable ReplicationChecks results.
From what I've read on https://access.redhat.com/solutions/359683, getting a "replica is busy" error is considered "normal".
This makes it difficult to monitor for actual replication errors.
Actual behaviour
An error similar to the above can occur intermittently on every FreeIPA server in a 3-node cluster.
There aren't any replication errors most of the time.
Expected behavior
It should not report an error.
A warning would be more suitable.
Version/Release/Distribution