Prometheus configuration auto-reloading #355

Merged
merged 6 commits into from
Jun 10, 2024
@@ -11,3 +11,12 @@ groups:
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."

      - alert: PrometheusConfigFailed
        expr: prometheus_config_last_reload_successful == 0
        for: 0m
        labels:
          severity: page
        annotations:
          summary: "Prometheus config reload in pod {{ $labels.kubernetes_pod_name }} has failed"
          description: "Prometheus instance {{ $labels.kubernetes_pod_name }} (`{{ $labels.instance }}`) has failed to reload its config."
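
The alert above fires when the `prometheus_config_last_reload_successful` gauge is 0. As a rough sketch of the same check done by hand, the helper below reads Prometheus text exposition output on stdin and succeeds only when that gauge is 1. The function name `last_reload_ok` is hypothetical and not part of this PR.

```shell
#!/bin/sh
# last_reload_ok (hypothetical helper): reads Prometheus text exposition
# format on stdin and returns 0 only if the reload gauge reports 1.
last_reload_ok() {
    # Sample lines look like "<metric name> <value>"; HELP/TYPE lines start
    # with "#", so their first field never matches the metric name.
    value="$(awk '$1 == "prometheus_config_last_reload_successful" { print $2 }')"
    [ "$value" = "1" ]
}
```

Assuming Prometheus listens on localhost:9090 as in the reload hook, it could be used as `curl -s http://localhost:9090/metrics | last_reload_ok && echo "last reload OK"`.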
37 changes: 37 additions & 0 deletions kubernetes/namespaces/monitoring/prometheus/deployment.yaml
@@ -13,6 +13,9 @@ spec:
    metadata:
      labels:
        app: prometheus
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
@@ -41,6 +44,32 @@ spec:
              mountPath: /etc/prometheus
            - name: prometheus-alerts
              mountPath: /opt/pydis/prometheus/alerts.d
        - image: ghcr.io/owl-corp/inotify-base:latest
          imagePullPolicy: Always
          name: prometheus-reloader
          securityContext:
            readOnlyRootFilesystem: true
          volumeMounts:
            - name: prometheus-config
              mountPath: /opt/monitor/prom-config
            - name: prometheus-alerts
              mountPath: /opt/monitor/prom-alerts
            - name: reloader-hook
              mountPath: /opt/pydis
            - name: reloader-tmpfs
              mountPath: /tmp
          env:
            - name: INOTIFY_HOOK_SCRIPT
              value: /opt/pydis/hook.sh
            # When a ConfigMap volume updates, we see a delete event for the
            # old timestamped directory
            - name: INOTIFY_WATCH_EVENTS
              value: delete
            - name: INOTIFY_HOOK_DELAY
              value: "5"
          envFrom:
            - secretRef:
                name: prometheus-reloader-env
      restartPolicy: Always
      securityContext:
        fsGroup: 2000
@@ -56,3 +85,11 @@ spec:
        - name: prometheus-alerts
          configMap:
            name: prometheus-alert-rules
        - name: reloader-hook
          configMap:
            name: prometheus-reloader-script
            defaultMode: 0777
        - name: reloader-tmpfs
          emptyDir:
            medium: Memory
            sizeLimit: 50Mi
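
The reloader watches for `delete` events because of how ConfigMap volumes are updated: the files are symlinks through a `..data` link into a versioned directory, and the old directory is removed after the link is repointed. The following is only an illustrative simulation of that layout in a scratch directory. The `publish` function and the `..v1`/`..v2` names are made up for the example (the kubelet uses timestamped names and an atomic rename rather than `ln -sfn`).

```shell
#!/bin/sh
# Illustrative simulation of a ConfigMap volume update. Names are hypothetical.

# publish VOLUME_DIR VERSION CONTENT
# Writes CONTENT to prometheus.yaml in a new versioned directory, repoints
# ..data at it, then deletes the previously published directory.
publish() {
    volume="$1"
    version="$2"
    content="$3"

    mkdir -p "$volume/..$version"
    printf '%s\n' "$content" > "$volume/..$version/prometheus.yaml"

    # Remember which directory ..data pointed at before the swap (if any).
    previous=""
    if [ -L "$volume/..data" ]; then
        previous="$(readlink "$volume/..data")"
    fi

    # Repoint ..data at the new version.
    ln -sfn "..$version" "$volume/..data"

    # The visible file is a symlink through ..data, so it is created once
    # and never rewritten on later updates.
    [ -L "$volume/prometheus.yaml" ] || ln -s "..data/prometheus.yaml" "$volume/prometheus.yaml"

    # Removing the old directory is what a watcher observes as a delete event.
    if [ -n "$previous" ] && [ "$previous" != "..$version" ]; then
        rm -rf "${volume:?}/$previous"
        echo "deleted $previous"
    fi
}
```

Calling `publish` twice on the same directory leaves only the newest versioned directory behind, which is why a watcher on the mount sees a delete rather than a modify on `prometheus.yaml` itself.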
@@ -0,0 +1,38 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-reloader-script
  namespace: monitoring
data:
  hook.sh: |-
    #!/bin/sh

    set -exo pipefail

    # Endpoint to call to reload Prometheus
    RELOAD_URL="http://localhost:9090/-/reload"
    # Icon for the webhook
    PROMETHEUS_ICON_URL="https://static-00.iconduck.com/assets.00/prometheus-icon-511x512-1vmxbcxr.png"

    echo "Detected change in mounted configmaps, reloading Prometheus..."

    # Make a temporary store to keep any errors
    RESPONSE_STORE="$(mktemp)"

    # Attempt the reload, writing the response to the tempfile and the reload HTTP
    # code to the variable
    RELOAD_RESULT="$(curl -o "$RESPONSE_STORE" -X POST "$RELOAD_URL" -s -w "%{http_code}")"

    # Parse and filter the response body into a JSON string
    RESPONSE_CONTENT="$(cat "$RESPONSE_STORE")"
    FILTERED_BODY="$(echo "$RESPONSE_CONTENT" | jq -Rsa .)"

    # Send a notification based on pass/failure
    if [ "$RELOAD_RESULT" -eq 200 ]; then
      BODY='{"username": "Prometheus Reloader", "embeds": [{ "title": "Prometheus Config Reload Succeeded", "description": "No errors.", "color": 6663286 } ], "avatar_url": "'"$PROMETHEUS_ICON_URL"'" }'
    else
      BODY='{"username": "Prometheus Reloader", "embeds": [{ "title": "Prometheus Config Reload Failed", "description": '"$FILTERED_BODY"', "color": 12799052 } ], "avatar_url": "'"$PROMETHEUS_ICON_URL"'" }'
    fi

    # Send the webhook
    curl -X POST -H "Content-Type: application/json" "$RELOADER_DISCORD_HOOK" -d "$BODY"
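
The hook separates the reload response body (written to a temp file) from the HTTP status (captured from `curl -w`). A minimal sketch of that pattern as a reusable function follows. `try_reload`, `http_post_status` and the `HTTP_POST` override are hypothetical names introduced here so the logic can be exercised without a running Prometheus. Unlike the hook, the sketch treats a failed connection as status `000` instead of exiting.

```shell
#!/bin/sh
# Sketch only: these function names are assumptions, not part of the PR.

# http_post_status URL OUTFILE
# POSTs to URL, writes the response body to OUTFILE, prints the HTTP status.
http_post_status() {
    curl -s -o "$2" -X POST -w '%{http_code}' "$1"
}

# try_reload URL
# Prints a one-line result and returns 0 only when the reload returned 200.
# Set HTTP_POST to the name of another function to replace the curl call.
try_reload() {
    url="$1"
    store="$(mktemp)"

    # If the request itself fails (e.g. connection refused), fall back to
    # "000" so a failure message can still be reported.
    status="$("${HTTP_POST:-http_post_status}" "$url" "$store")" || status="000"

    if [ "$status" = "200" ]; then
        echo "reload succeeded"
        result=0
    else
        echo "reload failed (HTTP $status): $(cat "$store")"
        result=1
    fi

    rm -f "$store"
    return "$result"
}
```

For example, `try_reload "http://localhost:9090/-/reload" || echo "notify someone"` would report both rejected configs and an unreachable Prometheus.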
Binary file not shown.