
Add annotation on config change #1001

Merged
merged 6 commits into from
Mar 18, 2024
Conversation

swenson
Contributor

@swenson swenson commented Mar 7, 2024

When updating the Vault config (and the corresponding configmap), we now generate a checksum of the config and set it as an annotation on both the configmap and the Vault StatefulSet pod template.

This allows the deployer to know which pods need to be restarted to pick up a changed config.

We still recommend using the standard upgrade [method for Vault on Kubernetes](https://developer.hashicorp.com/vault/tutorials/kubernetes/kubernetes-raft-deployment-guide#upgrading-vault-on-kubernetes), i.e., using the `OnDelete` strategy for the Vault StatefulSet so that updating the config and running `helm upgrade` does not trigger the pods to restart, and then deleting the pods one at a time, starting with the standby pods.
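As a sketch, that rollout could look like the following; the pod names (`vault-0`..`vault-2`) and the ordering are assumptions about a typical 3-node cluster, not part of this PR:

```shell
# Illustrative OnDelete rollout for a hypothetical 3-node Vault cluster.
# Print the commands first so the plan can be reviewed; standby pods go
# first, the active pod last to minimize leadership churn.
plan=$(for pod in vault-2 vault-1 vault-0; do
  echo "kubectl delete pod $pod"
  echo "kubectl wait --for=condition=Ready pod/$pod --timeout=120s"
done)
echo "$plan"
```

On a real cluster you would execute the printed commands once the plan looks right.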

With `kubectl` and `jq`, you can check which pods need to be updated by comparing each pod's `config/checksum` annotation against the current configmap's checksum:

```shell
kubectl get pods -o json | jq -r ".items[] | select(.metadata.annotations.\"config/checksum\" != $(kubectl get configmap vault-config -o json | jq '.metadata.annotations."config/checksum"') ) | .metadata.name"
```

Fixes #748.

@swenson swenson requested a review from a team as a code owner March 7, 2024 17:53
@benashz benashz self-requested a review March 7, 2024 17:59
Contributor

@benashz benashz left a comment


I like the approach. I think we definitely want to make including the dynamic annotation optional. We will also probably want to mention something about setting up the update strategy to control how best to handle updates whenever the configuration changes.

In addition, we should also extend the bats tests to cover the new features being added here.

```yaml
annotations:
  config/checksum: {{ include "vault.config" . | sha256sum }}
```
Contributor


I think we probably want to add a configurable option that controls inclusion/exclusion of this annotation, since changing a configuration could cause an unexpected restart of Vault. Users may currently have their own strategy for dealing with Vault config updates.
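Such a gate could look like the following sketch; the value name (`server.configAnnotation`) and annotation key match those used later in this PR's tests, but the exact template shape here is illustrative:

```yaml
# Sketch: emit the checksum annotation only when explicitly enabled,
# so existing upgrade workflows are unaffected by default.
{{- if .Values.server.configAnnotation }}
annotations:
  vault.hashicorp.com/config-checksum: {{ include "vault.config" . | sha256sum }}
{{- end }}
```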

@benashz
Contributor

benashz commented Mar 7, 2024

We may also want to prefix the annotation with something like `vault.` to avoid a collision with values...annotations.

And update Test.dockerfile to use virtualenv pip
install, as the global pip install no longer works.
@swenson
Contributor Author

swenson commented Mar 11, 2024

Updated. I also fixed the Test.dockerfile to work again (the way we were using pip to install yq no longer worked).
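The exact `Test.dockerfile` change is not shown in this thread; a hypothetical sketch of a virtualenv-based install (the venv path is an assumption):

```dockerfile
# Hypothetical: install yq into a virtualenv rather than globally,
# since recent pip versions refuse system-wide installs (PEP 668).
RUN python3 -m venv /opt/venv && /opt/venv/bin/pip install yq
ENV PATH="/opt/venv/bin:$PATH"
```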

Contributor

@tomhjp tomhjp left a comment


LGTM 👍 just a nit on the annotation prefix

templates/_helpers.tpl
Contributor

@benashz benashz left a comment


Thanks for making the changes. I have a couple more minor suggestions to take into consideration.

```shell
--set 'server.configAnnotation=true' \
. | tee /dev/stderr |
yq '.spec.template.metadata.annotations["vault.hashicorp.com/config-checksum"] == null' | tee /dev/stderr)
[ "${actual}" = "false" ]
```
Contributor


Should we also test the value here? Maybe assert something about its length, or expect a specific hash?

Contributor Author


I did that initially, but realized that the value might be something noisy and annoying to fix every time we change the config file.

Contributor


We are doing those checks as part of the other tests for this kind of feature:

```shell
[ "$(echo "$output" | yq -r '.metadata.labels | length')" = "5" ]
[ "$(echo "$output" | yq -r '.metadata.labels.release')" = "prometheus" ]
```

I think it is worth it to catch any potential regression where we could break the expected defaults.
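A length-based assertion is robust here because a SHA-256 hex digest is always 64 characters, so the test keeps passing when the default config changes while still catching a missing or empty value. A standalone sketch outside of bats (the sample config string is illustrative):

```shell
# Sketch: assert the checksum's shape rather than its exact value, so
# the test does not need updating whenever the default config changes.
# The config string below is illustrative.
checksum=$(printf 'listener "tcp" {}' | sha256sum | cut -d ' ' -f 1)
[ "${#checksum}" -eq 64 ] && echo "checksum has expected length"
```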

values.yaml
Contributor

@benashz benashz left a comment


I still think it is worth updating the tests to ensure that setting this annotation does not affect the defaults. Other than that, LGTM!


@swenson
Contributor Author

swenson commented Mar 18, 2024

Thanks!

@swenson swenson merged commit d186b6f into main Mar 18, 2024
8 checks passed
@swenson swenson deleted the vault-6645/configmap-change branch March 18, 2024 18:04
@tvoran tvoran added this to the v0.28.0 milestone Mar 19, 2024

Successfully merging this pull request may close these issues.

RFE: Ability to restart vault on configmap change
4 participants