-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix/issue 685 persisting silences in alertmanager #1069
base: main
Are you sure you want to change the base?
Fix/issue 685 persisting silences in alertmanager #1069
Conversation
GorginZ
commented
Jul 14, 2024
•
edited
Loading
edited
- partially fixes/workaround for Persisting silences in alertmanager #685 for users who self-install manifests (which is not supported installation of GMP)
- allow for provisioning of PVC for alertManager statefulset to persist silence status across restarts
This looked like very low-hanging fruit and the issue has some recent activity so thought I'd pick this up. I haven't exercised this in a cluster or written tests yet, just quickly put this in and looked at resulting templates for different values files which looked satisfactory - but will do proper testing when I have time this week. |
Thanks so much! This is a great change. This doesn't fully solve #685 since default-on managed users cannot edit and apply these manifests, but it works for those who self-install GMP. :) I'm just curious -- have you tried Kustomize as an alternative? You could even use it with Helm: https://trstringer.com/helm-kustomize/ The team did not expect users to use our Helm chart (yet). We're only using it for templating at the moment. I wonder what other flags users might want. For example, maybe someone would want a static ConfigMap as opposed to a PVC 🤔 Another idea the team had which we really liked was to introduce a new custom resource for silences (and so we could rebuild any temp dirs, or stuff it into a ConfigMap). Otherwise, thanks for the PR. It's great to see interest. |
Heya, I'm working with a client at the moment who uses GMP, but sadly we don't leverage the alertManager component so we aren't actually impacted by silences not being persisted ourselves - I would like us to use alertManager, but the appetite was to just make alerts with the native monitoring options that already exist and just leverage the metrics. I just noted the open issue because we had a funny slip the other day where a coworker accidentally deleted every CRD and I noticed prom operator does log errors (can't get it's configs), I was wondering if it should complain a little louder and was looking to see what issues were open (I did not open an issue and don't necessarily need to, like it's pretty easy to deduce what's wrong, but that's why I was browsing issues and saw this one and thought I'd start looking at it)
Yeah I did note the notices throughout the repo.
And did realize this shortly after picking it up and when I brought up a cluster last night to apply manifests and realized really all I can do is show that the compiled manifests are "fine" for the three different configurations people would do for self-installs...
Considering the relationship the manifests have with the installation on GKE clusters I can see why this would be an attractive approach. hmm. Happy to polish for at least the self-install case at least. Looks like you guys are fairly proactive here so I'd be keep on helping with other stuff too where I can. |
Wanted to apply templated manifests in a cluster this arvo to do some testing evidence, but running off main first for sanity ofc and hitting this #1013 I'll just like revert the readonly stuff introduced in #944 while I fiddle locally :) also
gotta get it up and running first, but I'll think further about these ideas. I initially like configMap idea most - probably more than having a PVC. And I may prefer it to CRDs just bc less objects/simplicity. |
Wow, thanks for helping here 💪🏽 I would love to bring that to completion one way or another. Are you somewhere e.g. on CNCF or Golang Slack to chat?
Wonder if you considered that? Essentially anyone can easily patch our manifests like you did in this repo. We could consider having this by default too, but is the default option considered useful in this form? Or we expect self-deploy users have to change it anyways? I also wonder if there is a room to work with Prometheus Operator and Alertmanager for some solution around ConfigMaps and CRDs. |
Hey @bwplotka I've just jumped on the CNCF slack and gave you a wave - which I should do ahead of KubeDay in Melbourne anyway! I have a little bit of spare time between roles next week so it's a good time to think about simple contributions before I start. |