Merge pull request #1 from rqlite/develop
Initial release of rqlite Helm chart
jtackaberry authored Jan 3, 2024
2 parents a930b8f + 3eaac1b commit cedad68
Showing 19 changed files with 1,612 additions and 2 deletions.
3 changes: 3 additions & 0 deletions .github/configs/cr.yaml
@@ -0,0 +1,3 @@
# Enable automatic generation of release notes using GitHub's release notes generator.
# https://docs.github.com/en/repositories/releasing-projects-on-github/automatically-generated-release-notes
generate-release-notes: true
38 changes: 38 additions & 0 deletions .github/workflows/release.yaml
@@ -0,0 +1,38 @@
name: Release Charts

on:
  push:
    branches:
      - master
  # Allows manual execution of jobs for testing
  workflow_dispatch:

jobs:
  release:
    permissions:
      contents: write
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
        with:
          fetch-depth: 0

      - name: Configure Git
        run: |
          git config user.name "$GITHUB_ACTOR"
          git config user.email "$GITHUB_ACTOR@users.noreply.github.com"

      - name: Install Helm
        uses: azure/setup-helm@5119fcb9089d432beecbf79bb2c7915207344b78 # v3.5
        with:
          version: v3.13.0

      - name: Run chart-releaser
        uses: helm/chart-releaser-action@a917fd15b20e8b64b94d9158ad54cd6345335584 # v1.6.0
        with:
          config: "./.github/configs/cr.yaml"
          charts_dir: charts
          skip_existing: true
        env:
          CR_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
30 changes: 28 additions & 2 deletions README.md
@@ -1,2 +1,28 @@
# helm-charts
Helm charts for rqlite
# rqlite Helm Charts

[![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://github.com/rqlite/helm-charts/blob/master/LICENSE)
[![Release Status](https://github.com/rqlite/helm-charts/workflows/Release%20Charts/badge.svg)](https://github.com/rqlite/helm-charts/actions)

A repository of Helm charts for the rqlite project.


## Usage

[Helm](https://helm.sh) must be installed to use the charts.
Please refer to Helm's [documentation](https://helm.sh/docs/) to get started.

Once Helm is installed, add the repo:

```sh
helm repo add rqlite https://rqlite.github.io/helm-charts
```
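
You can then confirm the repo is available and see which charts it provides:

```sh
helm search repo rqlite
```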

Currently this repo only contains a single chart, for rqlite itself.

👉 See the [rqlite chart
documentation](https://github.com/rqlite/helm-charts/tree/develop/charts/rqlite).


## License

[MIT License](https://github.com/rqlite/helm-charts/blob/master/LICENSE).
13 changes: 13 additions & 0 deletions charts/rqlite/Chart.yaml
@@ -0,0 +1,13 @@
apiVersion: v2
name: rqlite
version: 1.0.0
appVersion: 8.14.1
description: The lightweight, distributed relational database built on SQLite
type: application
home: https://rqlite.io/
sources:
- https://github.com/rqlite/
- https://github.com/rqlite/rqlite
- https://github.com/rqlite/helm-charts
maintainers:
- name: jtackaberry
256 changes: 256 additions & 0 deletions charts/rqlite/README.md
@@ -0,0 +1,256 @@
# rqlite Helm Chart

[rqlite](https://rqlite.io) is a lightweight, easy-to-use, distributed relational database
built on SQLite.

Experienced Helm user? Let's cut to the chase: the chart's default
[`values.yaml`](values.yaml) is what you want.


## Quick Start

First add the `rqlite` helm repository:

```bash
helm repo add rqlite https://rqlite.github.io/helm-charts
```

To install or upgrade a release named `rqlite` in a namespace called `db`, run:

```bash
helm upgrade -i -n db --create-namespace rqlite rqlite/rqlite
```

This uses the default chart values, which deploy a single-node rqlite cluster backed by a
10GiB persistent volume, without any authentication or TLS, accessible from within the
Kubernetes cluster at `http://rqlite.db.svc.cluster.local`.
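
To sanity-check the deployment from inside the Kubernetes cluster, you can issue a test
query from a throwaway curl pod (a quick sketch; the `curlimages/curl` image is just one
convenient choice):

```bash
kubectl run -n db curl-test --rm -it --restart=Never --image=curlimages/curl -- \
  curl -Gs 'http://rqlite.db.svc.cluster.local/db/query?pretty' \
  --data-urlencode 'q=SELECT 1'
```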

If you want a 3-node cluster, add `--set replicaCount=3` to the end of the command. Or,
if you'd like to change the storage size to, say, 50GiB, add `--set
persistence.size=50Gi`.

Naturally, once you have more than a handful of customizations, you will want to use a
separate values file. For example, `values.yaml` may contain:

```yaml
replicaCount: 3
persistence:
  size: 50Gi
resources:
  requests:
    cpu: 500m
```
Then you deploy with:
```bash
helm upgrade -i -n db --create-namespace rqlite rqlite/rqlite -f values.yaml
```

Refer to [Helm's documentation](https://helm.sh/docs/) for more details on using Helm itself.

Finally, read through the chart's default [`values.yaml`](values.yaml), which is well
commented and currently acts as the authoritative source of the chart's configuration
values.
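
You can also print those defaults directly with Helm, without cloning the repo:

```bash
helm show values rqlite/rqlite
```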


## Production Deployments

The default chart values deploy an unsecured single-node rqlite instance geared toward
low-friction testing, which means anyone with network access to the K8s Service or pods
has free rein over the rqlite database.

This may not be suitable for production deployments in your environment. It's recommended
you consider the following reliability- and security-related configuration (a starting
sketch follows the list):
* The number of replicas (`replicaCount`): at least 3 are required for high availability
* Password-based authentication and user permissions (`config.users`)
* Client-facing TLS, either by means of a TLS-terminating Ingress (`ingress.enabled`) or
  by configuring rqlite's native TLS support (`config.tls.client`)
* Depending on your personal/organizational requirements or environmental constraints,
  inter-node TLS (`config.tls.node`)
* Properly tuned resource requests and limits for your workload (`resources`)
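
For example, a production-leaning values file might look like the sketch below. Only
`replicaCount`, `resources`, and the option names called out above come from this chart's
documented surface; the exact schema of each `config.users` entry (password and
permission fields) should be checked against [`values.yaml`](values.yaml):

```yaml
replicaCount: 3

config:
  users:
    # Each entry has at least a "username"; see values.yaml for the
    # password and permission fields, which are elided here.
    - username: admin

  tls:
    client:
      enabled: true   # rqlite-native client-facing TLS
    node:
      enabled: true   # inter-node TLS

resources:
  requests:
    cpu: 500m
    memory: 512Mi   # illustrative value; size for your workload
```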

It's also recommended you either pin to a specific Helm chart version (by passing
`--version` to `helm`) or at least to a specific rqlite version (`image.tag`), particularly
if using deployment pipelines, so that you have explicit control over when the software is
upgraded in your environment.
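
For example, pinning both the chart version and the rqlite image to the versions in this
release:

```bash
helm upgrade -i -n db rqlite rqlite/rqlite --version 1.0.0 --set image.tag=8.14.1
```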


## Read-only Nodes

The chart supports deploying separate resources for [read-only
nodes](https://rqlite.io/docs/clustering/read-only-nodes/), where:

* `readonly.replicaCount` specifies the number of read-only nodes for the rqlite cluster
* read-only nodes are given their own dedicated Kubernetes Service (and Ingress, if
  enabled under `readonly.ingress`)
* remember to [use a read-consistency level of
  `none`](https://rqlite.io/docs/clustering/read-only-nodes/#querying-a-read-only-node)
  when querying the read-only endpoint, otherwise your queries will simply be
  forwarded to the cluster's current leader node, defeating the purpose of a dedicated
  read-only pool
* by default, read-only nodes inherit most of the same chart configuration values that
  voting nodes use, but most configuration can be overridden specifically for read-only
  nodes by specifying keys normally found at the top level within the `readonly` map
  (see the sketch after this list)
  * all configuration that can be overridden within `readonly` is so indicated in
    [`values.yaml`](values.yaml)
* unlike voting nodes, read-only nodes will automatically leave the cluster when the pod
  gracefully terminates, making it possible to use Horizontal Pod Autoscaling for
  demand-based scaling of read-only workloads
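
For instance, a minimal sketch adding two read-only nodes and overriding their resource
requests (this assumes `resources` is among the keys `readonly` can override; confirm
against [`values.yaml`](values.yaml)):

```yaml
replicaCount: 3

readonly:
  replicaCount: 2
  # Most top-level keys can be repeated here to override them for
  # read-only nodes only.
  resources:
    requests:
      cpu: 250m
```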


## Secrets

The chart accepts a number of sensitive configuration values, such as user passwords and
TLS private keys. As a best practice, define sensitive values in an appropriately secured
file, separate from non-secret values.

One solution is to use the popular
[helm-secrets](https://github.com/jkroepke/helm-secrets) plugin, which allows configuring
charts using [Sops](https://github.com/getsops/sops)-encrypted values files.

Non-secret configuration is then kept in `values.yaml` as usual, while sensitive values
such as `config.users` and `config.tls.node.key` live in a Sops-encrypted `secrets.yaml`.
The chart can then be deployed as:

```bash
helm upgrade -i -n db --create-namespace rqlite rqlite/rqlite -f values.yaml -f secrets://secrets.yaml
```
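
Before encryption, `secrets.yaml` might look like the sketch below (the exact shape of
each `config.users` entry is defined in [`values.yaml`](values.yaml)). Encrypt it in
place with `sops -e -i secrets.yaml` before committing it anywhere:

```yaml
config:
  users:
    # See values.yaml for the password and permission fields.
    - username: admin
  tls:
    node:
      key: |
        -----BEGIN PRIVATE KEY-----
        ...
        -----END PRIVATE KEY-----
```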

## Cluster Scaling and Recovery

Nodes come in two flavors: voting and read-only (non-voting).

### Scaling Voting Nodes

The chart value `replicaCount` dictates the number of voting nodes in the cluster. It's
strongly recommended that voting nodes only be scaled up or down by updating this value
and redeploying the chart (via `helm upgrade`), because the replica count is used in
multiple places in the chart (such as the PodDisruptionBudget).

Scaling voting nodes up can be done simply by increasing `replicaCount` and running `helm
upgrade`. The new nodes will mount a fresh PV, join the cluster, and synchronize the data
before receiving requests via the K8s Service.
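
For example, to grow a 3-node cluster to 5 voters:

```bash
helm upgrade -n db rqlite rqlite/rqlite -f values.yaml --set replicaCount=5
```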

On the other hand, scaling voting nodes *down* should follow rqlite's [documented
procedure for removing or replacing a
node](https://rqlite.io/docs/clustering/general-guidelines/#removing-or-replacing-a-node).
You can't simply decrease `replicaCount` and be done with it: once voting nodes have
joined the rqlite cluster, the remaining nodes will expect them to rejoin until they are
explicitly removed.

The basic procedure for scaling down the voting nodes is:
1. Redeploy the chart with the updated `replicaCount` to shrink the StatefulSet
2. Use the `rqlite` CLI or the HTTP API to remove each node that was dropped

It's important to shrink the voters in this order. Although the rest of the cluster will
complain loudly in the logs about the missing nodes until step 2 is completed, running the
procedure in reverse will cause transactions to be routed to the removed (now leaderless)
nodes until they eventually fail their readiness probes, and clients issuing those
requests will receive HTTP 503 errors.

One caveat to this order: you must ensure you never remove more than ceiling(N/2)-1 nodes
at a time, otherwise quorum will be lost. For example, in a 5-node cluster you may remove
at most 2 nodes at once.

For example, suppose you've deployed the chart with the release name `rqlite` in a
namespace called `db`, and you have a 5-node cluster and want to shrink to 3 nodes. First,
reinstall the chart with the lower replica count:

```bash
# In practice you'll more likely update your custom values.yaml
$ helm upgrade -n db rqlite rqlite/rqlite --set replicaCount=3
```

Then you can administratively drop the last 2 nodes from the rqlite cluster:

```bash
# Connect to the first voting node of the cluster.
$ kubectl exec -n db rqlite-0 -ti -- /bin/sh
# You will need to include additional arguments if you've enabled
# user authentication (-u user:password) or client TLS (-s https).
~ $ rqlite
Welcome to the rqlite CLI.
Enter ".help" for usage hints.
Connected to https://127.0.0.1:4001 running version v8.14.1
127.0.0.1:4001> .remove rqlite-4
127.0.0.1:4001> .remove rqlite-3
```

Note that the node ids are the pod names. If you deployed the chart with a release name
`rqlite-myapp` instead, then the node ids would be `rqlite-myapp-3` and `rqlite-myapp-4`.
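
If you'd rather use the HTTP API than the CLI, rqlite exposes a remove endpoint. A
sketch, assuming the chart defaults (no authentication or client TLS) and run from inside
the `rqlite-0` pod:

```bash
curl -XDELETE 'http://localhost:4001/remove' -d '{"id": "rqlite-4"}'
curl -XDELETE 'http://localhost:4001/remove' -d '{"id": "rqlite-3"}'
```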


#### Recovering From Permanent Loss of Quorum

If you've lost enough nodes that quorum can't be satisfied *and* the PVs for those pods
were also lost (otherwise you could simply scale back up to restore quorum), you will need
to perform [rqlite's quorum recovery
procedure](https://rqlite.io/docs/clustering/general-guidelines/#recovering-a-cluster-that-has-permanently-lost-quorum).

rqlite's Helm chart provides a mechanism to handle this with the `useStaticPeers` chart
value. During normal operation, `useStaticPeers` should be `false`, in which case rqlite
will use DNS provided by Kubernetes for peer discovery.

However, in the event that quorum can't be recovered, you can set `useStaticPeers` to
`true` temporarily, perform a rolling restart on all nodes in the cluster, and set it back
to `false`. Changing this value only updates a ConfigMap, so it won't trigger an unwanted
rollout of the StatefulSet when changing back to `false`.

For example, assume your deployment's usual values are in `values.yaml`, and again
assuming your release is called `rqlite` in the `db` namespace:

```bash
# Upgrade the chart with the useStaticPeers recovery setting
$ helm upgrade -n db rqlite rqlite/rqlite -f values.yaml --set useStaticPeers=true
# Restart all pods in the rqlite cluster. You can remove the last argument if
# you don't have readonly pods.
$ kubectl rollout -n db restart statefulset/rqlite statefulset/rqlite-readonly
```

At this point, once the pods restart and quiesce, quorum should be restored. As a final
step, don't forget to revert the `useStaticPeers` setting simply by redeploying the chart
using your original values without the override:

```bash
$ helm upgrade -n db rqlite rqlite/rqlite -f values.yaml
```

This last command won't restart the rqlite pods; it only prevents the use of `peers.json`
on the next restart, including when an existing rqlite pod crashes and is restarted by the
kubelet.


### Scaling Read-only Nodes

Read-only nodes don't participate in quorum, and the Helm chart deploys them such that
they will automatically leave the cluster on shutdown. This means the read-only
StatefulSet can be scaled up and down arbitrarily, and can even be driven by the
Horizontal Pod Autoscaler if you choose.

The chart value `readonly.replicaCount` defines the initial number of read-only replicas;
the count can thereafter be scaled dynamically, whether by running `kubectl scale`, using
the HPA, or via some other orchestrator.
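
For example, assuming the usual release name `rqlite` in the `db` namespace (which, as
above, yields a read-only StatefulSet named `rqlite-readonly`):

```bash
kubectl scale -n db statefulset/rqlite-readonly --replicas=5
```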


## Versioning

Helm charts use semantic versioning, and rqlite's chart offers the following guarantees:
* Breaking changes will only be introduced in new major versions
  * where "breaking" means you need to modify the Helm chart values to avoid breaking
    your deployment, or the chart points to a version of rqlite which itself contains
    breaking changes (such as non-backward-compatible API changes)
* New features or non-breaking changes will be introduced in minor versions
  * note that changes that result in a rolling restart of the rqlite cluster are fair
    game; they are not considered breaking
* Releases containing only bug fixes or trivial features will be introduced in patch
  releases

This approach extends to updates to rqlite itself: if rqlite releases a new minor version
(8.14.x to 8.15.0, say) and the chart's *only* update is to point to this new version, the
chart will be given a minor version increase rather than a patch-level increase, despite
the trivial nature of the change to the chart itself.


## License

[MIT License](./LICENSE).
49 changes: 49 additions & 0 deletions charts/rqlite/templates/NOTES.txt
@@ -0,0 +1,49 @@
{{/* vi: ft=helm.mustache */}}
{{- $name := tpl (include "rqlite.fullname" .) $ -}}
{{/* Construct URL for in-cluster access */}}
{{- $scheme := .Values.config.tls.client.enabled | ternary "https" "http" }}
{{- $svcurl := printf "%s://%s.%s.svc.cluster.local" $scheme $name .Release.Namespace }}
{{/*
If auth is enabled, grab (somewhat randomly) the first username in the users list for the
demo curl command. It's not guaranteed that user will have query permissions, but that's
probably an edge case.
*/}}
{{- $curlopts := (empty .Values.config.users) | ternary "" (
printf "-u %s:<password> " (get (.Values.config.users | first) "username" | default "<user>")
)
}}
{{- $curlcmd := printf "%s'%%s/db/query?pretty' --data-urlencode 'q=select unixepoch(\"subsecond\")'" $curlopts -}}

Deployment summary:

  Version:       {{ .Values.image.tag | default .Chart.AppVersion }}
  Nodes:         {{ .Values.replicaCount }}
  Auth:          {{ (empty .Values.config.users) | ternary "disabled (!)" "enabled" }}
  Endpoints:     {{ $svcurl }} (in-cluster)
{{- /*
If ingress is enabled, show the URLs for any defined hosts. If hosts weren't defined (user
elected to use the ingress default) then we don't know what FQDN to use, so skip it. The
URL path up to (but not including) the first bracket is displayed, which is a small
heuristic in case the user has defined a regexp-based path for URL rewriting.
*/ -}}
{{- if .Values.ingress.enabled -}}
{{- range .Values.ingress.hosts -}}
{{- $url := printf "https://%s%s" . (splitList "(" $.Values.ingress.path | first) -}}
{{- $_ := set $ "ingressurl" $url -}}
{{- $url | nindent 17 }} (ingress)
{{- end -}}
{{- end }}
  Internode TLS: {{ .Values.config.tls.node.enabled | ternary "yes" "no" }}
  Client TLS:    {{ .Values.config.tls.client.enabled | ternary "yes" "no" }}

On another pod inside the cluster, you can issue a test query using curl:

$ curl -Gks {{ printf $curlcmd $svcurl }}

{{- if and .Values.ingress.enabled .Values.ingress.hosts }}

Or outside the cluster via the defined ingress:

$ curl -Gs {{ printf $curlcmd $.ingressurl }}

{{- end }}