diff --git a/.github/actions/spelling/expect.txt b/.github/actions/spelling/expect.txt index 102bea5830..4a5f3dd6a4 100644 --- a/.github/actions/spelling/expect.txt +++ b/.github/actions/spelling/expect.txt @@ -40,6 +40,7 @@ appcreationrequest APPNAME appv appversion +applicationdeployment aquasecurity architecting ARCHS @@ -205,6 +206,7 @@ ghtoken ginkgotypes giscus Gitlab +gitops gke glasskube gms @@ -387,6 +389,7 @@ linenums linkedin linkedtrace livenessprobe +loadtests LOCALBIN logf logr diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops.md b/docs/blog/posts/multi-stage-delivery-using-gitops.md new file mode 100644 index 0000000000..44696f2b22 --- /dev/null +++ b/docs/blog/posts/multi-stage-delivery-using-gitops.md @@ -0,0 +1,409 @@ +--- +date: 2024-03-26 +authors: [bacherfl] +description: > + This blog post explains how to use the promotion phase of Keptn to implement a multi stage delivery GitOps workflow. +categories: + - GitOps + - Observability +comments: true +--- + +# Multi Stage Delivery using GitOps + +In multi-stage environments it can be a challenge to see +how a particular version of a workload progresses through different stages. +This can make it difficult to trace exactly which modification introduced a problem +when something goes wrong in one of the deployment stages. + +Keptn helps to address this challenge +by providing a distributed OpenTelemetry trace that encompasses +all deployment stages and contains all relevant information, +such as the git commit ID that triggered the deployment of a workload. +For example, if the evaluation of a load test in one of the deployment stages +fails, the distributed trace generated by Keptn contains +details about the result of the evaluation, as well as a link to the +deployment trace of the previous stage. 
+This makes it easy to trace back the deployment of that particular workload +across the previous stages, right back to the original commit that resulted in +the performance degradation. + +This blog post demonstrates an example workflow that automates the promotion +of a sample application across two different stages. +The deployment traces of those stages are linked together and enriched +with valuable metadata, such as the commit ID that triggered the deployment +of a new workload version. + + + +## Technologies used for this example + +For this, we are using the following technologies: + +- The new [KeptnAppContext](../../docs/reference/crd-reference/appcontext.md) +resource + that lets you pass metadata to the generated deployment traces, and define a `promotion` + task that is executed once the application is deployed and all post-deployment checks have been + executed successfully. +- [ArgoCD](https://argoproj.github.io/cd/) as a GitOps tool. + In addition to automatically synchronising the cluster with its desired + state, ArgoCD also adds metadata (such as the git commit ID + that triggered the last sync) to the `KeptnAppContext` CRD. +- [GitHub Actions](https://github.com/features/actions): The GitOps + repository is hosted on GitHub so we can use GitHub Actions + to implement the promotion of an artifact from one stage to the next. + We do this by running a workflow that creates the pull requests for updating the + application manifests in the different stages. +- [Helm](https://helm.sh): The configuration of the application for each stage + is maintained via two separate Helm charts, one for each stage. +- [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/)/[Jaeger](https://www.jaegertracing.io): + The deployment traces are gathered by the OpenTelemetry Collector and forwarded to Jaeger, +which displays the generated traces graphically. +- [Prometheus](https://prometheus.io): Provides monitoring data for the application. 
+ +Note that for this blog post, we are assuming that these tools are already +installed on the Kubernetes cluster, as going through the installation +of each of those would exceed the scope of this blog post. + +We are going to do the following: + +1. Set up the environment by: + 1. Setting up a GitHub repository with an access token, GitHub workflows, and GitHub Actions + 1. Preparing a Kubernetes namespace for each stage (`dev` and `prod`) + 1. Preparing the ArgoCD applications with an appropriate Helm chart for each + 1. Applying labels to associate the `Deployment` resource with the `KeptnWorkload` resource + 1. Defining Keptn pre-/post-deployment checks and tasks + 1. Defining the metadata to be passed through the deployment traces + 1. Defining a `traceParent` that links the deployment traces of the `prod` stage to those of the `dev` stage +1. Run the promotion flow by: + 1. Creating a pull request to update our `dev` environment + 1. Merging the automatically created pull request to promote the updated version into `prod` + 1. Inspecting the generated deployment traces for both stages + and seeing how they are connected with each other + +## Setting up the Environment + +Now it's time to set up our environment and connect all the tools mentioned above +with each other. + +### Set up the GitHub repository + +First things first, since we talk about GitOps in this article, we need +a git repository to host the Helm charts of our application. +We use GitHub in this example, which allows us +to use GitHub Actions to implement the promotion from +`dev` to `production`. + +In this example, we are using [this repository](https://github.com/bacherfl/keptn-analysis-demo) +as an upstream repository. +If you would like to try the demo yourself, feel free to fork this repository +and start experimenting with Keptn from there. + +#### Create personal access token + +We need to create a personal access token for accessing the GitHub API. 
+This token will be used by the container running the promotion +task during the post-deployment phase of the `KeptnApp` within +the `dev` stage. +The container uses this access token to trigger a GitHub action +that creates a pull request to promote the version that has +been deployed from `dev` into `production`. +Using GitHub actions rather than interacting directly with the Git repository in the container that executes the promotion +step lets us avoid granting the container any write permissions to the +repository. + +Instead, we use an access token with a restricted set of permissions. +We make use of GitHub's [fine-grained access tokens](https://github.blog/2022-10-18-introducing-fine-grained-personal-access-tokens-for-github/) +to restrict the token so that it can only trigger workflow actions, +exclusively within our GitOps repository. +The required permissions are highlighted in the screenshot below: + +![Token Permissions](./multi-stage-delivery-using-gitops/token-permissions.png) + +#### Enable GitHub workflows + +We also need to enable GitHub workflows to write to the repo +and create pull requests. +This is done in the settings of the repository; see the screenshot below: + +![Workflow Permissions](./multi-stage-delivery-using-gitops/workflow-permissions.png) + +The GitHub action performing the promotion is implemented +in the `.github/workflows/promote.yaml` file +located in our GitOps repository. + +```yaml +{% include "./multi-stage-delivery-using-gitops/promote.yaml" %} +``` + +This action copies the `values.yaml` file from the +`dev` stage to the `prod` stage, to set the +service versions that should be deployed via the Helm chart for +that stage. + +### Prepare the application namespaces + +In this example, the application will be deployed in +two different namespaces, each representing a different +stage (`dev` and `prod`). 
+To create the namespaces, execute the following commands: + +```shell +kubectl create namespace simple-go +kubectl annotate namespace simple-go keptn.sh/lifecycle-toolkit=enabled +kubectl create namespace simple-go-prod +kubectl annotate namespace simple-go-prod keptn.sh/lifecycle-toolkit=enabled +``` + +The promotion task that triggers the action to +create a pull request for promoting an application version +from `dev` to `production` will be executed in the `simple-go` namespace. +Therefore, we need to create a secret containing the GitHub personal +access token we created earlier, using the following command: + +```shell +GH_REPO_OWNER= +GH_REPO= +GH_API_TOKEN= +kubectl create secret generic github-token -n simple-go --from-literal=SECURE_DATA="{\"githubRepo\":\"${GH_REPO}\",\"githubRepoOwner\":\"${GH_REPO_OWNER}\",\"apiToken\":\"${GH_API_TOKEN}\"}" +``` + +### Prepare the ArgoCD applications + +The next step is to +create the ArgoCD applications in our cluster. +Each stage of our application (`dev` and `prod`) is +represented by a separate ArgoCD application that points to +a Helm chart for the respective stage. +The Helm charts can be found in our [GitOps repository](https://github.com/bacherfl/keptn-analysis-demo) +in the following sub folders: + +- `simple-app/chart-dev`: Contains the Helm chart for the application in the `dev` stage +- `simple-app/chart-prod`: Contains the Helm chart for the application in the `prod` stage + +The ArgoCD applications are created by applying the following manifest: + +```yaml title="argo-apps.yaml" +{% include "./multi-stage-delivery-using-gitops/argo-apps.yaml" %} +``` + +This manifest contains the definitions for the two ArgoCD applications +each of which points to one of the helm charts mentioned earlier. +In addition to that, the `$ARGOCD_APP_REVISION` environment variable +is used to get access to the git commit ID that triggered +a new deployment of our applications. +This ID is passed through to the Helm chart. 
+Keptn uses this as metadata for a `KeptnApp` deployment. + +After applying the file, using `kubectl apply -f argo-apps.yaml`, +ArgoCD begins to synchronize the state of the applications, +meaning that the Helm charts for the applications are applied to the +cluster. +While this is happening, let's have a closer look at the actual +content of the Helm charts. + +### What's in the Helm chart for dev stage + +Each chart contains two `Deployments/Services` +(`simple-go-service` and `simple-go-backend`), representing +the two `KeptnWorkloads` that are part of our `KeptnApp`. +Let's take the `simple-go-service` `Deployment` as an example +to see how we prepared it to be managed by Keptn: + +```yaml +{% include "./multi-stage-delivery-using-gitops/deployment.yaml" %} +``` + +#### Labels + +To correctly associate the `Deployment` resource with the `KeptnWorkload` resource, +the following labels are set: + +- `app.kubernetes.io/name`: The name of the `KeptnWorkload` that should be associated with the `Deployment`. +- `app.kubernetes.io/part-of`: The name of the `KeptnApp` containing the two workloads. +- `app.kubernetes.io/version`: The version for the related `KeptnWorkload`. + +#### Pre and post-deployment tasks + +In addition to the labels which define the `KeptnWorkload`, we also use +the `keptn.sh/post-deployment-tasks` to define a post-deployment task for the +workload. +The task defined here (`wait-for-monitoring`) ensures that the Prometheus +target for the workload is available, before proceeding with +the execution of the load tests of the overall application. + +#### KeptnAppContext + +The `KeptnAppContext` provides two important capabilities for multi-stage delivery: + +- Define tasks and evaluations that run before or after the application deployment +- Add metadata and links to traces for a specific application. 
+This enables you to enrich your traces +with additional information that you can use +to understand and analyze the performance of your applications. +The `KeptnAppContext` for the `dev` stage looks as follows: + + ```yaml + {% include "./multi-stage-delivery-using-gitops/keptnappcontext.yaml" %} + ``` + +This resource contains a list of pre- and post-deployment checks +for the complete application. +In the `pre-deployment` phase, the task `wait-for-prometheus` +ensures the Prometheus installation in our cluster is available. +If this is not the case, it would not be wise to deploy a new +version of the application, since we cannot observe the +performance metrics of our application. + +Once all workloads have been deployed, the application enters the +`post-deployment` phase, in which load tests against the application are +executed. + +After executing the load tests, a `post-deployment` evaluation is +performed, to compare the response time measured by the load tests with a threshold you have defined. + +If this evaluation is successful, the application proceeds into the +`promotion` phase. +This is the phase where the GitHub personal access token we created earlier +is used to trigger the GitHub action to promote the deployed version +into the next stage. + +#### Metadata + +In addition to the pre-/post-deployment checks and the promotion task, +the `KeptnAppContext` also contains a `metadata` property that +passes the `commitID` made available by ArgoCD to the +application deployment. +This information is then added by Keptn as an attribute to the +OpenTelemetry traces created for the application deployment. + +To configure the application, the `values.yaml` +file is used. +Within that file, the versions for the two workloads that +are part of the application are defined, +as well as the target response time for the evaluation +in the post-deployment phase. +The Git commit ID mentioned earlier is also set here. 
+The Git commit ID is empty by default, but is set automatically +by ArgoCD, using the `$ARGOCD_APP_REVISION` environment variable. + +```yaml +{% include "./multi-stage-delivery-using-gitops/values-dev.yaml" %} +``` + +### What's in the Helm chart for prod stage + +The Helm chart of the `prod` stage is rather similar to the one +for the `dev` stage, but differs in the `values.yaml`, and the +`KeptnAppContext`. +First, let's inspect the `values.yaml` in `prod`: + +```yaml +{% include "./multi-stage-delivery-using-gitops/values-prod.yaml" %} +``` + +#### TraceParent property + +The `values.yaml` file for the `prod` stage +contains an additional property called `traceParent`, +which is essential in linking the deployment traces of the +`prod` stage to the previous stage, i.e. the `dev` stage. +The `traceParent` is propagated from Keptn to the GitHub action that +does the promotion by adapting the `values.yaml` file to +specify the workload versions that should be deployed in `prod`. + +#### spanLinks property + +In our example, the value of the `traceParent` is the span ID of the +`promotion` phase of the `dev` stage. +To pass this property to Keptn, the `spanLinks` property of the `KeptnAppContext` +below is used: + +```yaml +{% include "./multi-stage-delivery-using-gitops/keptnappcontext-prod.yaml" %} +``` + +This causes the OpenTelemetry deployment trace in `prod` to have a reference +to the `promotion` phase in `dev`, indicating that the successful deployment +of the application in `dev` is what caused the deployment in `prod`. + +## Promotion flow from `dev` to `prod` + +Now that the GitOps repository and the ArgoCD application are set up, +let's have a closer look at how a new service version would make its way +into `dev` and then into `prod`. 
+To do this, the `values.yaml` file for the `dev` stage is edited to +change the service version from `v1` to `v2`: + +![Updating the dev stage](./multi-stage-delivery-using-gitops/updating-dev.png) + +After this change is committed to the GitOps repository, ArgoCD +eventually starts to synchronize the application, and the new service +version is deployed to `dev`. +This is reflected by a new `KeptnAppVersion` being created by Keptn, +for which the pre-/post-deployment checks +and the evaluation that were mentioned earlier are executed. +After some time, the new version is up and running in `dev` +and the deployment trace for the new `KeptnAppVersion` is +visible in Jaeger: + +![Deployment trace dev](./multi-stage-delivery-using-gitops/deployment-trace-dev.png) + +You can see that the generated trace also contains the commitID that triggered +the deployment (i.e. the commit in which the version was changed). +We also see that the `promotion` phase has been executed successfully, so let's +check our GitOps repository and inspect the automatically created pull request +to promote the version into the next stage: + +![PR to update prod](./multi-stage-delivery-using-gitops/pr-dev-to-prod.png) + +As expected, the pull request updates the `values.yaml` file for the +`prod` stage, setting the `serviceVersion` to the same value we just +deployed in `dev`. +In addition to that, the `traceParent` property is set to the +span ID of the `promotion` phase of the deployment in `dev`. 
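
The `promote` task definition that dispatches the workflow is not shown in this post, so the following is only a sketch of the kind of request such a task could send. It assumes GitHub's REST endpoint for creating a `workflow_dispatch` event and reuses the `GH_REPO_OWNER`, `GH_REPO` and `GH_API_TOKEN` values stored in the `github-token` secret. The function names, the `--send` flag and the `TRACE_PARENT` variable on the caller side are hypothetical and chosen for illustration.

```shell
#!/bin/sh
# Sketch only: dispatch the promote workflow with a traceParent input.
# Function names and the --send flag are illustrative assumptions.

# Build the JSON body for the workflow_dispatch request.
# $1: git ref the workflow should run on, $2: traceParent value to hand over.
build_dispatch_payload() {
  printf '{"ref":"%s","inputs":{"traceParent":"%s"}}' "$1" "$2"
}

# Send the request to GitHub's workflow dispatch endpoint.
# Expects GH_REPO_OWNER, GH_REPO and GH_API_TOKEN in the environment.
dispatch_promotion() {
  curl --fail --silent --show-error -X POST \
    -H "Accept: application/vnd.github+json" \
    -H "Authorization: Bearer ${GH_API_TOKEN}" \
    "https://api.github.com/repos/${GH_REPO_OWNER}/${GH_REPO}/actions/workflows/promote.yaml/dispatches" \
    -d "$(build_dispatch_payload "$1" "$2")"
}

# Only talk to the GitHub API when explicitly asked to.
if [ "${1:-}" = "--send" ]; then
  dispatch_promotion "main" "${TRACE_PARENT:?set TRACE_PARENT first}"
fi
```

Running the script with `--send` and `TRACE_PARENT` exported performs the call; without the flag it only defines the functions. The `inputs.traceParent` key matches the input declared in `promote.yaml`, which is what ends up appended to the `values.yaml` of the `prod` stage.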
+ +Once the PR is merged, Keptn deploys +the new version in the `prod` stage, and eventually +we will see the deployment trace for that stage in +Jaeger as well: + +![Deployment trace prod](./multi-stage-delivery-using-gitops/deployment-trace-prod.png) + +As we can see in the deployment trace, we also have the commitID that triggered +the deployment in that stage, just as in `dev`, but +in addition to that the trace also contains a reference +to the span ID of the `promotion` phase in `dev`. +This ultimately allows us to trace back the deployment of a particular +service version across multiple stages, right to the commit that +introduced a change to the affected service. + +## Conclusion + +Time to wrap up what we have learned in this example. +We have seen how the `KeptnAppContext` resource +can be used to define pre-/post-deployment checks and to +pass important metadata - in our example, using ArgoCD, +the git commit ID that triggered a new deployment - +to be added as attributes to the deployment traces +generated by Keptn. +Then, to gain observability not only for an isolated stage, +but across multiple stages, the `spanLinks` property +of the `KeptnAppContext` was used to create references to +deployment traces of a previous stage when +promoting a new version of a service from one stage to the next. +This way, if any kind of problem appears in one of the later +stages (in this example in the `prod` stage) for a newly deployed version, +the links to the deployment traces of the previous stages +enable us to trace back the deployment of that new version +across the previous stages, until we reach the commit that +caused the erroneous behavior of that service. + +We hope the example in this blog post gives you some inspiration +on how you could integrate Keptn into your continuous delivery +workflow. 
+If you would like to try out Keptn and its capabilities yourself, +feel free to head over to the [Keptn docs](https://lifecycle.keptn.sh/docs/) +and follow the guides to [install Keptn](https://lifecycle.keptn.sh/docs/install/). +We also appreciate any feedback and are always happy to support you +with any questions you might have. diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops/argo-apps.yaml b/docs/blog/posts/multi-stage-delivery-using-gitops/argo-apps.yaml new file mode 100644 index 0000000000..ccfa603e38 --- /dev/null +++ b/docs/blog/posts/multi-stage-delivery-using-gitops/argo-apps.yaml @@ -0,0 +1,49 @@ +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: simple-go-app-context + namespace: argocd +spec: + project: default + source: + repoURL: 'https://github.com/bacherfl/keptn-analysis-demo' + path: simple-app/chart-dev + targetRevision: HEAD + helm: + parameters: + - name: "commitID" + value: "$ARGOCD_APP_REVISION" + destination: + server: 'https://kubernetes.default.svc' + namespace: simple-go + syncPolicy: + automated: + prune: true + selfHeal: true + syncOptions: + - CreateNamespace=true +--- +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: simple-go-app-context-prod + namespace: argocd +spec: + project: default + source: + repoURL: 'https://github.com/bacherfl/keptn-analysis-demo' + path: simple-app/chart-prod + targetRevision: HEAD + helm: + parameters: + - name: "commitID" + value: "$ARGOCD_APP_REVISION" + destination: + server: 'https://kubernetes.default.svc' + namespace: simple-go + syncPolicy: + automated: + prune: true + selfHeal: true + syncOptions: + - CreateNamespace=true diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops/deployment-trace-dev.png b/docs/blog/posts/multi-stage-delivery-using-gitops/deployment-trace-dev.png new file mode 100644 index 0000000000..d061d85c06 Binary files /dev/null and b/docs/blog/posts/multi-stage-delivery-using-gitops/deployment-trace-dev.png 
differ diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops/deployment-trace-prod.png b/docs/blog/posts/multi-stage-delivery-using-gitops/deployment-trace-prod.png new file mode 100644 index 0000000000..8c7f1bd26a Binary files /dev/null and b/docs/blog/posts/multi-stage-delivery-using-gitops/deployment-trace-prod.png differ diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops/deployment.yaml b/docs/blog/posts/multi-stage-delivery-using-gitops/deployment.yaml new file mode 100644 index 0000000000..4a47dfb73a --- /dev/null +++ b/docs/blog/posts/multi-stage-delivery-using-gitops/deployment.yaml @@ -0,0 +1,26 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: simple-go-service + namespace: simple-go +spec: + selector: + matchLabels: + app: simple-go-service + template: + metadata: + labels: + app: simple-go-service + app.kubernetes.io/name: simple-go-service + app.kubernetes.io/part-of: simple-go + app.kubernetes.io/version: {{.Values.serviceVersion}} + keptn.sh/post-deployment-tasks: wait-for-monitoring + spec: + containers: + - image: bacherfl/simple-go-service:{{.Values.serviceVersion}} + imagePullPolicy: Always + name: simple-go-service + ports: + - containerPort: 9000 + name: http + protocol: TCP diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops/keptnappcontext-prod.yaml b/docs/blog/posts/multi-stage-delivery-using-gitops/keptnappcontext-prod.yaml new file mode 100644 index 0000000000..89bac76794 --- /dev/null +++ b/docs/blog/posts/multi-stage-delivery-using-gitops/keptnappcontext-prod.yaml @@ -0,0 +1,16 @@ +apiVersion: lifecycle.keptn.sh/v1beta1 +kind: KeptnAppContext +metadata: + name: simple-go-prod + namespace: simple-go-prod +spec: + + preDeploymentTasks: + - wait-for-prometheus + postDeploymentTasks: + - post-deployment-loadtests + - post-deployment-loadtests-backend + spanLinks: + - {{.Values.traceParent}} + metadata: + commitID: {{.Values.commitID}} diff --git 
a/docs/blog/posts/multi-stage-delivery-using-gitops/keptnappcontext.yaml b/docs/blog/posts/multi-stage-delivery-using-gitops/keptnappcontext.yaml new file mode 100644 index 0000000000..2d44f3b2e7 --- /dev/null +++ b/docs/blog/posts/multi-stage-delivery-using-gitops/keptnappcontext.yaml @@ -0,0 +1,17 @@ +apiVersion: lifecycle.keptn.sh/v1beta1 +kind: KeptnAppContext +metadata: + name: simple-go + namespace: simple-go +spec: + preDeploymentTasks: + - wait-for-prometheus + postDeploymentTasks: + - post-deployment-loadtests + - post-deployment-loadtests-backend + postDeploymentEvaluations: + - response-time + promotionTasks: + - promote + metadata: + commitID: {{.Values.commitID}} diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops/pr-dev-to-prod.png b/docs/blog/posts/multi-stage-delivery-using-gitops/pr-dev-to-prod.png new file mode 100644 index 0000000000..dad5ac55a9 Binary files /dev/null and b/docs/blog/posts/multi-stage-delivery-using-gitops/pr-dev-to-prod.png differ diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops/promote.yaml b/docs/blog/posts/multi-stage-delivery-using-gitops/promote.yaml new file mode 100644 index 0000000000..a2c67f6f19 --- /dev/null +++ b/docs/blog/posts/multi-stage-delivery-using-gitops/promote.yaml @@ -0,0 +1,46 @@ +name: promote + +on: + workflow_dispatch: + inputs: + traceParent: + description: 'OTEL parent trace' + required: false + type: string + +permissions: + contents: write + pull-requests: write + +jobs: + promote: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v3 + - run: | + # configure git client + git config --global user.email "" + git config --global user.name "" + + # create a new branch + git switch -c production/${{ github.sha }} + + # promote the change + cp dev/values.yaml production/values.yaml + + echo "traceParent: $TRACE_PARENT" >> production/values.yaml + + # push the change to the new branch + git add production/values.yaml + git commit -m "Promote dev to production" + git push 
-u origin production/${{ github.sha }} + env: + TRACE_PARENT: ${{ inputs.traceParent }} + - run: | + gh pr create \ + -B main \ + -H production/${{ github.sha }} \ + --title "Promote dev to production" \ + --body "Automatically created by GHA" + env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops/token-permissions.png b/docs/blog/posts/multi-stage-delivery-using-gitops/token-permissions.png new file mode 100644 index 0000000000..90a01e822d Binary files /dev/null and b/docs/blog/posts/multi-stage-delivery-using-gitops/token-permissions.png differ diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops/updating-dev.png b/docs/blog/posts/multi-stage-delivery-using-gitops/updating-dev.png new file mode 100644 index 0000000000..194983287e Binary files /dev/null and b/docs/blog/posts/multi-stage-delivery-using-gitops/updating-dev.png differ diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops/values-dev.yaml b/docs/blog/posts/multi-stage-delivery-using-gitops/values-dev.yaml new file mode 100644 index 0000000000..837db20919 --- /dev/null +++ b/docs/blog/posts/multi-stage-delivery-using-gitops/values-dev.yaml @@ -0,0 +1,4 @@ +serviceVersion: v1 +backendServiceVersion: v1 +targetResponseTime: "0.50" +commitID: "" diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops/values-prod.yaml b/docs/blog/posts/multi-stage-delivery-using-gitops/values-prod.yaml new file mode 100644 index 0000000000..b94e61de20 --- /dev/null +++ b/docs/blog/posts/multi-stage-delivery-using-gitops/values-prod.yaml @@ -0,0 +1,5 @@ +serviceVersion: v1 +backendServiceVersion: v1 +targetResponseRate: "0.50" +commitID: "" +traceParent: "" diff --git a/docs/blog/posts/multi-stage-delivery-using-gitops/workflow-permissions.png b/docs/blog/posts/multi-stage-delivery-using-gitops/workflow-permissions.png new file mode 100644 index 0000000000..c092ebda61 Binary files /dev/null and 
b/docs/blog/posts/multi-stage-delivery-using-gitops/workflow-permissions.png differ