This guide is intended for system administrators and operations engineers who are responsible for maintaining a Sourcegraph Kubernetes cluster. Each section covers a topic or tool that may be helpful in managing the cluster.
The following commands are useful to gain visibility into cluster status.
List all running pods. | kubectl get pods -o=wide |
Describe pod state, including reasons why a pod is not running successfully. | kubectl describe pod $POD_NAME |
Tail logs. | kubectl logs -f $POD_NAME |
Open a shell in a running pod container. | kubectl exec -it $POD_NAME -- sh |
Get a PostgreSQL client on the production database. | kubectl exec -it $(kubectl get pods -l app=pgsql -o jsonpath="{.items[0].metadata.name}") -- psql -U sg |
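The commands above compose well into small shell helpers. The following is a minimal sketch, not part of Sourcegraph's tooling: the function names first_pod and describe_unhealthy_pods are hypothetical, and the sketch assumes kubectl is already configured for the cluster and namespace in question.

```shell
#!/bin/sh
# first_pod (hypothetical helper) prints the name of the first pod that
# matches a label selector, e.g. "app=pgsql".
first_pod() {
  kubectl get pods -l "$1" -o jsonpath='{.items[0].metadata.name}'
}

# describe_unhealthy_pods (hypothetical helper) runs `kubectl describe pod`
# on every pod whose phase is not Running. Note that this also matches
# pods that have completed (phase Succeeded).
describe_unhealthy_pods() {
  for pod in $(kubectl get pods --field-selector=status.phase!=Running \
      -o jsonpath='{.items[*].metadata.name}'); do
    kubectl describe pod "$pod"
  done
}
```

For example, after sourcing these functions, kubectl logs -f "$(first_pod app=pgsql)" would tail the logs of the database pod without looking up its name by hand.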
Prometheus is an open-source application monitoring system and time series database. It is commonly used to track key performance metrics over time, such as the following:
- QPS
- Application requests by URL route name
- HTTP response latency
- HTTP error codes
- Time since last search index update
Follow the steps to deploy Prometheus.
After updating the cluster, the running Prometheus pod will be visible in the list printed by kubectl get pods. Once this is enabled, Prometheus will begin recording performance metrics across all services running in Sourcegraph.
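Once Prometheus is running, its HTTP API can be queried from the command line. The sketch below is illustrative only: prom_query is a hypothetical helper name, and the default PROM_URL assumes the Prometheus pod has been port-forwarded to localhost on port 9090 (Prometheus's default port), which may differ in a given deployment.

```shell
#!/bin/sh
# Assumption: Prometheus is reachable at PROM_URL, for example after running
#   kubectl port-forward $POD_NAME 9090:9090
# against the Prometheus pod. The default below is illustrative.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# prom_query (hypothetical helper) sends an instant PromQL query to the
# Prometheus HTTP API and prints the raw JSON response.
prom_query() {
  curl -sS -G "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}
```

Running prom_query up is a reasonable first check, since up is the metric Prometheus itself records for every scrape target. Queries for QPS or latency depend on the metric names that the Sourcegraph services export, which are not listed here.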
Distributed tracing tools are useful when debugging performance issues such as high query latency. Sourcegraph uses the OpenTracing standard and can be made to work with any tracing tool that satisfies that standard. Currently, two tracing tools are supported by Sourcegraph configuration:
The sourcegraph-server-gen command supports creating and restoring snapshots of the database, which can be useful for backups and for syncing database state from one cluster to another. Download the binary for your platform:
- On macOS:
  curl -O https://storage.googleapis.com/sourcegraph-assets/sourcegraph-server-gen/darwin_amd64/sourcegraph-server-gen
  chmod +x ./sourcegraph-server-gen
- On Linux:
  curl -O https://storage.googleapis.com/sourcegraph-assets/sourcegraph-server-gen/linux_amd64/sourcegraph-server-gen
  chmod +x ./sourcegraph-server-gen
Run sourcegraph-server-gen snapshot --help for more information.
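The platform-specific downloads above can be folded into one script that picks the URL from the output of uname -s. This is a sketch under stated assumptions: the function names server_gen_url and install_server_gen are hypothetical, and only the two URLs given above are used.

```shell
#!/bin/sh
BASE_URL="https://storage.googleapis.com/sourcegraph-assets/sourcegraph-server-gen"

# server_gen_url (hypothetical helper) prints the download URL for an OS name
# as reported by `uname -s`, and returns non-zero for any other value.
server_gen_url() {
  case "$1" in
    Darwin) echo "$BASE_URL/darwin_amd64/sourcegraph-server-gen" ;;
    Linux)  echo "$BASE_URL/linux_amd64/sourcegraph-server-gen" ;;
    *)      echo "unsupported OS: $1" >&2; return 1 ;;
  esac
}

# install_server_gen (hypothetical helper) downloads the binary into the
# current directory and marks it executable.
install_server_gen() {
  url=$(server_gen_url "$(uname -s)") || return 1
  curl -O "$url" && chmod +x ./sourcegraph-server-gen
}
```

After sourcing the script, calling install_server_gen performs the same two steps as the manual instructions for the current platform.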