Skip to content

Weekly routine Issue #139

Weekly routine Issue

Weekly routine Issue #139

name: Weekly routine Issue
on:
schedule:
- cron: "0 8 * * 1"
jobs:
job:
runs-on: ubuntu-24.04
steps:
- name: Create routine issue
shell: python
run: |
import datetime
import os
import re
import sys
import requests
payload = {
"title": f"Week {datetime.datetime.now().strftime('%W %Y')} routine",
"body": os.getenv("ISSUE_BODY"),
"labels": re.sub(r"\s", "", os.getenv("LABELS", "")).split(",") or None,
"assignees": re.sub(r"\s", "", os.getenv("ASSIGNEES", "")).split(",") or None,
}
api_url = f"https://api.github.com/repos/{os.getenv('REPO')}/issues"
url = f"https://github.com/{os.getenv('REPO')}/issues"
headers = {
"Accept": "application/vnd.github.v3+json",
"Authorization": f"token {os.getenv('ACCESS_TOKEN')}",
}
resp = requests.get(api_url, headers=headers)
if resp.status_code != 200:
print(f"❌ Couldn't retrieve issues for {url} using {api_url}.")
print(f"HTTP {resp.status_code} {resp.reason} - {resp.text}")
print("Check your `ACCESS_TOKEN` secret.")
sys.exit(1)
resp = requests.post(api_url, headers=headers, json=payload)
if resp.status_code != 201:
print(f"❌ Couldn't create issue for {url}")
print(f"HTTP {resp.status_code} {resp.reason} - {resp.text}")
sys.exit(1)
print(f"✅ Issue successfully created at {url}/{resp.json().get('number')}")
sys.exit(0)
env:
ACCESS_TOKEN: ${{ secrets.ACCESS_TOKEN }}
REPO: kiwix/k8s
LABELS: maint
ASSIGNEES: rgaudin,benoit74
ISSUE_BODY: |
## Check nodes free space
```sh
df -h / && df -h /data
```
- [ ] create a report in issue comment
## Nodes system upgrades
```sh
apt update && apt upgrade
```
- [ ] run systematically the upgrade on bastion, stats, services, storage, demo, mirrors-qa nodes
- [ ] check for and apply important security upgrade on worker nodes asap (imager-worker, ondemand, sisyphus)
(regular workers updates are done separately on a monthly basis for worker nodes to not impact production)
## Backups
- [ ] Ensure [all borg repositories](https://www.borgbase.com/) are being updated
## k8s cluster
- [ ] Check Pod errors or in CrashLoopBackoff
```sh
k get pods -A -o wide|grep -E 'Error|Crash'
```
- [ ] Check Pod restarts
```sh
k get pods -A -o wide | pyp -i 'print("\n".join([line for line in l if re.split(r"\s+", line)[4] != "0"]))'
```
- [ ] Check if k8s should/could be upgraded
```sh
curl -s -H "X-Auth-Token: $SCW_SECRET_KEY" https://api.scaleway.com/k8s/v1/regions/fr-par/clusters/$KIWIX_PROD_CLUSTER | jq ".version,.upgrade_available"
curl -s -H "X-Auth-Token: $SCW_SECRET_KEY" https://api.scaleway.com/k8s/v1/regions/fr-par/versions | jq ".versions[].name"
```
- [ ] [Upgrade k8s](https://github.com/kiwix/k8s/wiki/Cluster-Setup#upgrading-kubernetes) if applicable/possible
## Stats
[matomo - stats.kiwix.org](https://stats.kiwix.org)
- [ ] Ensure download.kiwix.org stats are being recorded
- [ ] Check whether matomo should be upgraded
## Grafana
- [ ] Alert list is [normal](https://kiwixorg.grafana.net/alerting/list)
- [ ] Zimfarm dashboard is [normal](https://kiwixorg.grafana.net/d/d2803d94-7c40-4338-bf80-f3cd7cd796bf/zimfarms?from=now-7d&to=now)
- [ ] Jobs durations dashboard is [normal](https://kiwixorg.grafana.net/d/bb0f0990-04c5-4314-8afc-6185ac49c668/jobs?orgId=1)
- [ ] There is no abnormal behaviors on [cluster resources consumption](https://kiwixorg.grafana.net/a/grafana-k8s-app/navigation/cluster/kiwix-prod)
- [ ] All [mdadm](https://kiwixorg.grafana.net/d/edu6v6ekri77kd/mdadm) RAID arrays are OK (check all instances in the top left combobox)
## Projects
- [ ] UptimeRobot [has no alert](https://dashboard.uptimerobot.com/monitors)
- [ ] [zimit backlog](https://farm.zimit.kiwix.org/pipeline/filter-todo) is reasonable
- [ ] [Cloud Code Signing Certificate](https://secure.ssl.com/certificate_orders/co-291j8smt34b) usage is *OK* (look for *Unused Signings* under *END ENTITY CERTIFICATES*): we have 1,200 per year (August to August)
- [ ] Analyze [zimit failed tasks](https://farm.zimit.kiwix.org/pipeline/filter-failed) and document bad domains in our [WIP blacklist](https://docs.google.com/spreadsheets/d/1mBjWT0hLmeg6EqT4nNEfCzLU8hGSzYs4IgbWDInhPqA/edit?gid=0#gid=0)
- [ ] [PRs awaiting your review](https://github.com/notifications?query=reason%3Areview-requested)
## Security
- [ ] Analyze/merge [dependabot PRs](https://github.com/notifications?query=author%3Adependabot[bot])
**Note**: this is an *automatic reminder* intended for the assignee(s).