fix: add missing pmie webhook action configuration functionality #183

Merged
merged 6 commits into from
Dec 12, 2023

Conversation

richm
Collaborator

@richm richm commented Dec 11, 2023

Resolves Red Hat issue RHEL-13760

@richm
Collaborator Author

richm commented Dec 11, 2023

[citest]

@richm richm force-pushed the debug-test-failures branch from 086edc6 to acf9cef Compare December 11, 2023 23:59
@richm
Collaborator Author

richm commented Dec 12, 2023

[citest]

@richm
Collaborator Author

richm commented Dec 12, 2023

@natoscott Some findings:

RHEL7 pmlogger failure - https://dl.fedoraproject.org/pub/alt/linuxsystemroles/logs/lsr-citool_metrics-183-99586c4_RHEL-7.9-updates-20231114.0_20231212-005055/artifacts/tests_default-FAILED.log

Dec 12 00:42:24 b27db37e-308c-4008-8c89-b87ac5be8267.testing-farm pmlogger[22435]: Starting pmlogger ...
Dec 12 00:42:24 b27db37e-308c-4008-8c89-b87ac5be8267.testing-farm systemd[1]: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file or directory
Dec 12 00:42:46 b27db37e-308c-4008-8c89-b87ac5be8267.testing-farm root[23827]: pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log
Dec 12 00:42:46 b27db37e-308c-4008-8c89-b87ac5be8267.testing-farm systemd[1]: Daemon never wrote its PID file. Failing.
Dec 12 00:42:46 b27db37e-308c-4008-8c89-b87ac5be8267.testing-farm systemd[1]: Failed to start Performance Metrics Archive Logger.

It looks as though pmlogger never wrote its PID file /run/pcp/pmlogger.pid, but I'm not sure why. Does this remind you of anything? If not, I'll keep looking. Are there any log files under /var/log that might contain more information? I'll note that there are no AVCs in this case.
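On the question of where else to look: the journal excerpt above already names one file, /var/log/pcp/pmlogger/pmlogger_daily-K.log. A minimal sketch, assuming the journal lines are available as plain strings, of pulling such references out of a failed test log (the helper name `referenced_logs` is hypothetical and not part of the role or of PCP):

```python
import re

# Matches any whitespace-delimited token that starts with /var/log/.
_LOG_PATH = re.compile(r"(/var/log/\S+)")


def referenced_logs(journal_lines):
    """Return the /var/log paths mentioned in the lines, in first-seen order."""
    seen = []
    for line in journal_lines:
        for path in _LOG_PATH.findall(line):
            # Drop trailing sentence punctuation that is not part of the path.
            path = path.rstrip(".,;:)")
            if path not in seen:
                seen.append(path)
    return seen


if __name__ == "__main__":
    sample = [
        "Dec 12 00:42:46 host root[23827]: pmlogger_daily failed - see "
        "/var/log/pcp/pmlogger/pmlogger_daily-K.log",
        "Dec 12 00:42:46 host systemd[1]: Daemon never wrote its PID file. Failing.",
    ]
    for path in referenced_logs(sample):
        print(path)
```

This only lists files that the log messages themselves point at; it does not guess at other locations.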

RHEL9 grafana failure - https://dl.fedoraproject.org/pub/alt/linuxsystemroles/logs/lsr-citool_metrics-183-99586c4_RHEL-9.4.0-20231209.15_20231212-005612/artifacts/tests_bz1855544-FAILED.log - looks like selinux AVC:

type=AVC msg=audit(1702342060.402:2116): avc:  denied  { getattr } for  pid=40738 comm="grafana-server" path="/efi" dev="xvda2" ino=1 scontext=system_u:system_r:grafana_t:s0 tcontext=system_u:object_r:dosfs_t:s0 tclass=dir permissive=0
type=AVC msg=audit(1702342060.848:2136): avc:  denied  { getattr } for  pid=40753 comm="grafana-server" path="/efi" dev="xvda2" ino=1 scontext=system_u:system_r:grafana_t:s0 tcontext=system_u:object_r:dosfs_t:s0 tclass=dir permissive=0

We could add a sefcontext for this using the selinux system role - wdyt?
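Before deciding on any policy workaround, it can help to reduce the AVC records to the distinct denials they describe. A minimal sketch, assuming audit lines in the format shown above; the function names `parse_avc` and `unique_denials` are hypothetical and not part of any audit or SELinux tooling:

```python
import re

# key=value pairs in an audit record; values may be double-quoted.
_FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')
# The permission list between braces after "denied".
_PERMS = re.compile(r"denied\s+\{\s*([^}]*?)\s*\}")


def parse_avc(line):
    """Return a dict of fields from one AVC denial line, or None."""
    if "avc:" not in line or "denied" not in line:
        return None
    fields = {key: value.strip('"') for key, value in _FIELD.findall(line)}
    match = _PERMS.search(line)
    fields["perms"] = match.group(1).split() if match else []
    return fields


def _selinux_type(context):
    """Extract the type (third field) from a user:role:type:level context."""
    parts = context.split(":")
    return parts[2] if len(parts) > 2 else context


def unique_denials(lines):
    """Collapse AVC lines into sorted (source, target, class, perm, path) tuples."""
    seen = set()
    for line in lines:
        rec = parse_avc(line)
        if rec is None:
            continue
        source = _selinux_type(rec.get("scontext", ""))
        target = _selinux_type(rec.get("tcontext", ""))
        for perm in rec["perms"]:
            seen.add((source, target, rec.get("tclass", ""), perm, rec.get("path", "")))
    return sorted(seen)
```

Applied to the two records above, which differ only in timestamp and pid, this yields a single denial: grafana_t attempting getattr on the dosfs_t directory /efi.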

Looks like the other redis service error was fixed by your patch.

@natoscott
Collaborator

@natoscott Some findings:

RHEL7 pmlogger failure - https://dl.fedoraproject.org/pub/alt/linuxsystemroles/logs/lsr-citool_metrics-183-99586c4_RHEL-7.9-updates-20231114.0_20231212-005055/artifacts/tests_default-FAILED.log

Dec 12 00:42:24 b27db37e-308c-4008-8c89-b87ac5be8267.testing-farm pmlogger[22435]: Starting pmlogger ...
Dec 12 00:42:24 b27db37e-308c-4008-8c89-b87ac5be8267.testing-farm systemd[1]: Can't open PID file /run/pcp/pmlogger.pid (yet?) after start: No such file or directory
Dec 12 00:42:46 b27db37e-308c-4008-8c89-b87ac5be8267.testing-farm root[23827]: pmlogger_daily failed - see /var/log/pcp/pmlogger/pmlogger_daily-K.log
Dec 12 00:42:46 b27db37e-308c-4008-8c89-b87ac5be8267.testing-farm systemd[1]: Daemon never wrote its PID file. Failing.
Dec 12 00:42:46 b27db37e-308c-4008-8c89-b87ac5be8267.testing-farm systemd[1]: Failed to start Performance Metrics Archive Logger.

It looks as though pmlogger never wrote its PID file /run/pcp/pmlogger.pid, but I'm not sure why. Does this remind you of anything? If not, I'll keep looking. Are there any log files under /var/log that might contain more information? I'll note that there are no AVCs in this case.

Hmm, a number of issues relating to interactions with systemd were resolved in RHEL-8, and this could be related to one of those, but nothing specific comes to mind. If we cannot figure it out from those additional logs, I guess we could just skip this particular test on that platform (there are no open customer issues there).

RHEL9 grafana failure - https://dl.fedoraproject.org/pub/alt/linuxsystemroles/logs/lsr-citool_metrics-183-99586c4_RHEL-9.4.0-20231209.15_20231212-005612/artifacts/tests_bz1855544-FAILED.log - looks like selinux AVC:

type=AVC msg=audit(1702342060.402:2116): avc:  denied  { getattr } for  pid=40738 comm="grafana-server" path="/efi" dev="xvda2" ino=1 scontext=system_u:system_r:grafana_t:s0 tcontext=system_u:object_r:dosfs_t:s0 tclass=dir permissive=0
type=AVC msg=audit(1702342060.848:2136): avc:  denied  { getattr } for  pid=40753 comm="grafana-server" path="/efi" dev="xvda2" ino=1 scontext=system_u:system_r:grafana_t:s0 tcontext=system_u:object_r:dosfs_t:s0 tclass=dir permissive=0

We could add a sefcontext for this using the selinux system role - wdyt?

Temporarily, I guess. We need to fix this in Grafana; the crew are looking into it and will hopefully get a new build through this week.

Looks like the other redis service error was fixed by your patch.

Ah excellent, thanks for pushing that through.

@richm
Collaborator Author

richm commented Dec 12, 2023

[citest]

@richm
Collaborator Author

richm commented Dec 12, 2023

@natoscott ... and of course, now that I have added more debugging for the el7 failure, it doesn't fail :-(
Still, I think we should merge this PR. It will definitely help if we see these failures downstream, and it seems to fix some of the other failures.

For RHEL 9.4: if the grafana selinux issue is known and being worked on, I don't think we need to add a sefcontext in the metrics role.

@richm richm marked this pull request as ready for review December 12, 2023 15:23
@richm richm requested a review from natoscott as a code owner December 12, 2023 15:23
Collaborator

@natoscott natoscott left a comment

LGTM, thanks @richm

@richm richm changed the title from "fix: fix various test failures" to "fix: add missing pmie webhook action configuration functionality" Dec 12, 2023
@richm richm merged commit 7705c31 into linux-system-roles:main Dec 12, 2023
23 of 26 checks passed
@richm richm deleted the debug-test-failures branch December 12, 2023 21:51