Cluster Infrastructure #24

Open
24 tasks
webwurst opened this issue Jul 20, 2019 · 0 comments

FIXME Link here to doc about our current Kubernetes cluster and hosting setup.

  • Monitoring
    • Add customized dashboards using Grafonnet to kube-prometheus.
      • cert-manager
      • openebs
    • Configure alerts and send notifications to a Matrix channel.
    • Add website analytics with Fathom.
    • Create public status page with overview of current apps.
    • Regularly check observatory.mozilla.org for all public sites.
  • Authentication
    • OpenID Connect via Keycloak for kube-apiserver and apps.
    • Add gangway.
  • Security
  • Shared services
    • Kinto
    • Postgres
    • Minio
    • Elasticsearch
  • Backup
    • Push database snapshots and filestores regularly to some S3 storage.
  • Stability
    • Automatically replace the oldest node every twelve hours with a fresh one. Maybe with the help of kured.
    • Make sure resource limits are set for every pod.
    • Back every service with at least two replicas. Label apps that can't deal with this.
    • Set PodDisruptionBudget for all apps.
    • Set recommended labels for all resources.
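
A rough sketch of how the Stability items on limits, replicas and recommended labels could be checked automatically, assuming Deployment manifests are already loaded as plain Python dicts. The function name `check_deployment` and the opt-out label `example.org/single-replica` are made up for illustration and are not part of our setup; the label keys are the ones from the Kubernetes recommended labels list.

```python
# Recommended label keys from the Kubernetes documentation.
RECOMMENDED_LABELS = (
    "app.kubernetes.io/name",
    "app.kubernetes.io/instance",
    "app.kubernetes.io/version",
    "app.kubernetes.io/component",
    "app.kubernetes.io/part-of",
    "app.kubernetes.io/managed-by",
)

# Hypothetical label marking apps that cannot run with more than one replica.
SINGLE_REPLICA_LABEL = "example.org/single-replica"


def check_deployment(manifest):
    """Return a list of human-readable problems found in a Deployment dict."""
    problems = []

    labels = manifest.get("metadata", {}).get("labels") or {}
    missing = [key for key in RECOMMENDED_LABELS if key not in labels]
    if missing:
        problems.append("missing recommended labels: " + ", ".join(missing))

    spec = manifest.get("spec", {})
    # Kubernetes defaults to one replica when the field is omitted.
    replicas = spec.get("replicas", 1)
    if replicas < 2 and labels.get(SINGLE_REPLICA_LABEL) != "true":
        problems.append("fewer than two replicas and not labelled as single-replica")

    containers = spec.get("template", {}).get("spec", {}).get("containers", [])
    for container in containers:
        limits = container.get("resources", {}).get("limits") or {}
        for resource in ("cpu", "memory"):
            if resource not in limits:
                name = container.get("name", "<unnamed>")
                problems.append("container %s: no %s limit" % (name, resource))

    return problems
```

Something like this could run in CI against rendered manifests and fail the build when the returned list is non-empty. PodDisruptionBudgets live in separate objects, so they are not covered here.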

Random Ideas

  • Try Varnish with traffic and crashes.
  • Add blackbox exporter for our public services.