-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fence_sbd: if sbd devices are not specified with option, read SBD_DEVICE #591
base: main
Are you sure you want to change the base?
fence_sbd: if sbd devices are not specified with option, read SBD_DEVICE #591
Conversation
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/1/input |
Guess we have to carefully think if this behavior is desirable. Klaus |
|
You have to run |
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/2/input |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @liangxin1300 !
Guess we have to carefully think if this behavior is desirable. Since it is possible to have nodes that do support running the SBD daemon (e.g. proper watchdog-device) and others that don't I'd rather have the fence-agent behave the same way regardless on which node it is running.
It still makes sense to respect explicitly specified "devices" especially since there might be the cases where some of the cluster nodes are not running sbd daemon but can be a fencing executor. But OTOH IMO, it also makes sense to be possible to omit the parameter for the normal cases where all nodes are configured to be running sbd daemon. So that by falling back to sysconfig, it'll avoid unnecessary duplication or even inconsistency with what's configured in the sysconfig.
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/3/input |
f542fab
to
4aff0dc
Compare
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/4/input |
4aff0dc
to
f4222cb
Compare
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/5/input |
f4222cb
to
bda9d27
Compare
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/6/input |
Puh ... that is a while back but was there a reason to explicitly parse sbd-config? Not 100% sure but I guess systemd is executing it as shell-script (if not we would at least have consistent behavior) so we might do here as well. And shouldn't it even be in the environment of pacemaker and thus be inherited to the fence-agents? |
SBD is not an easy thing to set up. It involves setups of watchdog & block devices, on-disk metadata, sysconfig file, CIB configuration... When the other sbd fence agent was implemented, I believe it was exactly for the purpose of easing the setup and more importantly ensuring the consistency by reading already known configuration from sysconfig rather than unnecessarily repeating them in the CIB.
Unlikely so. It seems to do the parsing by itself:
The python module "configparser" could be a helper. Although it was originally meant for parsing ini files, but apparently it's used by us as well such as:
It'd probably be asking too much of pacemaker if it was purely for the purpose of passing on the environment variables to the fence agent? It might be more convincing if pacemaker itself should care about any of the sbd settings.
Sounds sensible.
It might arguably be a change of behavior but it wouldn't affect existing setups. I believe all of them have always had "devices" explicitly configured. I'd rather not unnecessarily complicate this. This agent handles fencing via disk-based sbd. It seems very strait-forward to me that it should try finding the device list from sysconfig if "devices" is not explicitly specified. Of course if SBD_DEVICE is not even configured in sysconfig, an error should be dropped.
Separate sbd resources configured for different nodes with specific "devices" or without it, together with location constraints should do? |
bda9d27
to
3afaf47
Compare
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/7/input |
retest this please |
I guess that was a misunderstanding. I meant that the environment that is currently passed to the fence-agents by pacemaker should already contain the definitions from /etc/sysconfig/sbd. Didn't check though. |
Ah, you are right. Somehow I forgot pacemaker already retrieves sbd sysconfig. In that case, it'd be even better to just reference the environment variable SBD_DEVICE in here. |
3afaf47
to
794b26e
Compare
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/9/input |
794b26e
to
744d534
Compare
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/10/input |
The CI fail has nothing to do with the code, so no need to worry about it. |
Was thinking of a smart way to check for a locally running sbd daemon. |
But as long as there should be the use cases where a node is not running sbd daemon but could be an sbd fencing executor, either fence agent or sbd CLI should be able to handle hanging block devices by itself anyway such as ClusterLabs/sbd#148 ?
Checking with systemd might be more straight-forward but of course technically less generic and more of a dependency. OTOH doing it via sbd.pid would of course require logic in configure.ac dealing with |
Guess we should have checking for the daemon in this PR. @oalbrigt what would you prefer? |
Yeah. I guess it would make sense to check if sbd is configured and running in this case. |
from environment