Ported monitoring stack to k3s #449
base: main
No image changes since the last build, so the last commit should be ready to merge, barring review changes.
Please note in the PR (with better wording!) that
- the "prometheus" group is basically short for "kube-prometheus-stack" group!
- the monitoring link in CaaS now accesses grafana with anonymous auth (b/c it has to go via OOD), so CaaS users can't change their dashboards
Co-authored-by: Steve Brasier <[email protected]>
…slurm-appliance into feature/k3s-monitoring
@wtripp180901 not a high priority but would be nice to know if this PR reduces the size of the data in the image. And/or whether we can reduce the required root disk size at all - which isn't the same thing, b/c e.g. dnf caches which we throw away require additional size during build. I think you'd need qemu-img info to see the former. And monitoring disk usage during build to see the latter.
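The two measurements suggested above could be sketched roughly as follows. This is only an illustration: the helper names `image_sizes`, `qemu_img_info` and `root_disk_used_bytes` are hypothetical, and it assumes `qemu-img` is on the PATH and that the image is a local file. `qemu-img info --output=json` reports `virtual-size` and, for local files, `actual-size` in bytes.

```python
import json
import shutil
import subprocess


def image_sizes(info_json: str) -> dict:
    """Pull the size fields out of `qemu-img info --output=json` text."""
    info = json.loads(info_json)
    return {
        "virtual_bytes": info["virtual-size"],
        # actual-size can be missing for some storage backends
        "actual_bytes": info.get("actual-size"),
    }


def qemu_img_info(path: str) -> dict:
    """Run qemu-img against an image file (hypothetical path) and parse it."""
    result = subprocess.run(
        ["qemu-img", "info", "--output=json", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return image_sizes(result.stdout)


def root_disk_used_bytes(mount: str = "/") -> int:
    """Bytes currently used on a filesystem; sample this during the build."""
    return shutil.disk_usage(mount).used
```

Comparing `actual_bytes` between the old and new images would indicate the data size, while the peak of `root_disk_used_bytes()` sampled during a build would indicate the root disk actually required.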
The monitoring stack (prometheus/node exporter/grafana/alertmanager) binary installs have been removed from site and fatimage; the kube-prometheus-stack Helm chart is now installed into the k3s cluster during the site run. Containers are pre-pulled by podman and exported into k3s during the fatimage build.
As a consequence, the `grafana`, `alertmanager` and `node exporter` groups have been removed, and the associated roles are now all managed by the `prometheus` role, which is short for kube_prometheus_stack.
Also reduced the metrics collected by node exporter to the minimal set described in `docs/monitoring-and-logging.README.md`, which was previously unimplemented.
Note that because of how OOD's proxying interacts with Grafana's server config and Kubernetes, enabling OOD means that Grafana is only accessible through the OOD proxy. In the caas environment, this means that accessing Grafana requires authenticating with OOD's basic auth. Therefore, accessing Grafana through caas no longer logs you in as the admin user; you instead access the dashboards anonymously.
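For readers unfamiliar with Grafana's anonymous access, it is controlled by the `[auth.anonymous]` section of `grafana.ini`, which the Grafana Helm chart accepts under a `grafana.ini` values key. The sketch below builds such a values fragment. It is not taken from this PR: the function name is hypothetical, and both the `Viewer` role and the nesting under a top-level `grafana` key (as for a subchart of kube-prometheus-stack) are assumptions.

```python
import json


def grafana_anonymous_values(org_role: str = "Viewer") -> dict:
    """Helm values fragment enabling read-only anonymous access (assumed layout)."""
    return {
        "grafana": {
            "grafana.ini": {
                "auth.anonymous": {
                    "enabled": True,
                    # Viewer cannot edit dashboards; assumed, not confirmed by the PR
                    "org_role": org_role,
                }
            }
        }
    }


if __name__ == "__main__":
    # JSON is also valid YAML, so this output could be saved as a values file
    print(json.dumps(grafana_anonymous_values(), indent=2))
```

With a `Viewer` role, anonymous users can browse dashboards but not modify them, which matches the behaviour described for CaaS users.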
Tests as of 8ca0407: