System scope tasks seem to have problems when XDG_RUNTIME_DIR is set #72
Comments
I've done some more experimentation, removing the setting of XDG_RUNTIME_DIR from the lsr.systemd role. I'm a little surprised to see the polkit authentication; I don't see that in interactive sessions, so I don't understand what's happening. I suspect my IPA domain might have something to do with it too, but I'm not certain there either.
What version of the role are you using? What is the platform and version of the managed node? What version of ansible are you using?
Thanks for responding!
I tried again with a freshly installed system (that is, without all my "local customizations": a stock Fedora 41 netinst install, upgraded to the latest as of today, with no special auth configuration) and got a similar result. localadmin is the starting user, with the ability to sudo by virtue of being in the wheel group:
I was able to run this successfully (against an f41 node and a centos-10 stream node) with ansible_user=root (as opposed to going in as another user and elevating), using an entry in authorized_keys. Here's the output for the reload task at -vvv:
I have spent some more time with this, and I see that the current Galaxy linux-system-roles release seems to lag the latest version of this repo by a bit. I have gotten the things I want working by making the following changes (to the lsr version, but I think the same things will apply here):
My reasoning for this is that it is conceivable that you would connect as a non-root user and want to manage both non-root units and root units in the same play. By checking user_id you can see what you've connected as. (Though I think there might be a problem here if you're connected as a non-privileged user and expect to escalate to root to manage system units, but become comes out as "false" because the user in the dict is root but you're not necessarily connected as root.) I wonder if it would be OK to always set become_user; it would be a proper error, I think, if the ansible user is unable to become any of the users it wants to manage, and in general I think users can become themselves. I will work on submitting a proper PR. My hope is that this doesn't screw anything up for older versions of RHEL that we have to support.
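To make the idea concrete, here is a minimal sketch of the kind of check described above. It is not the role's actual code: `unit_user` and `unit_uid` are hypothetical variables standing in for the owner of the units being managed, and the sketch assumes facts have been gathered so that `ansible_facts['user_id']` reflects the connecting user.

```yaml
# Illustrative sketch only; unit_user and unit_uid are hypothetical variables.
- name: Reload systemd for the scope that matches the unit owner
  ansible.builtin.systemd:
    daemon_reload: true
    # Root-owned units live in the system scope, everything else in the user scope.
    scope: "{{ 'system' if unit_user == 'root' else 'user' }}"
  # Escalate only when the connecting user differs from the unit owner.
  become: "{{ unit_user != ansible_facts['user_id'] }}"
  become_user: "{{ unit_user }}"
  # Set XDG_RUNTIME_DIR only for user-scope operations, never for system scope.
  environment: "{{ {} if unit_user == 'root' else {'XDG_RUNTIME_DIR': '/run/user/' ~ unit_uid} }}"
```

The point of the sketch is that XDG_RUNTIME_DIR is exported only when the task targets a user scope, so system-scope reloads run with an unmodified environment.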
It should not, at least for "real" fixes and features. There are a lot of commits related to testing, ci, etc. that might not be in the published role or collection, but all fixes and features in the Galaxy published code should be up-to-date.
Thanks for looking at this! Yeah, the differences were not enormous. Mostly formatting and a couple of things like that. The comments I made here predate the PR. As an update: the PR sets I was able to test this on F41, on CentOS 9-stream, and on a fresh almalinux 8 in my homelab. I am unaware of other potential pitfalls. I think generally:
There seem to be a number of potential gotchas here, though, so I completely understand caution. :)
The design philosophy for the system roles is:
The proposed PR goes against that, but from what you have reported, it seems like the right thing to do. Note that I borrowed this implementation from the podman system role quadlet support, and that has been used extensively to manage user quadlets, so I'm surprised we haven't seen this issue there.
[1] I realize that there are ways around that for determined/savvy users, but that is far from typical.
I ran into some problems with synthesized host records in systemd-resolved, and was using this collection to inject dropins.
Using the latest version, I see consistent hangs at this point:
These things seem not to happen when XDG_RUNTIME_DIR is not set. (I'm not sure that's the problem, but it kind of looks like it.)
Here are the playbook and template I was running (the unit reload at the end works fine without the extra env var setting):
And the template: