diff --git a/ADR/0008-environment-provisioning.md b/ADR/0008-environment-provisioning.md index afcea53c..6775dc4f 100644 --- a/ADR/0008-environment-provisioning.md +++ b/ADR/0008-environment-provisioning.md @@ -4,7 +4,7 @@ Date: 2022-12-14 ## Status -Accepted +Superseded by [ADR 32. Decoupling Deployment](0032-decoupling-deployment.html) ## Approvers diff --git a/ADR/0014-let-pipelines-proceed.md b/ADR/0014-let-pipelines-proceed.md index a2367dcd..a67d8aa7 100644 --- a/ADR/0014-let-pipelines-proceed.md +++ b/ADR/0014-let-pipelines-proceed.md @@ -10,6 +10,7 @@ Accepted Relates to: * [ADR 13. AppStudio Test Stream - API contracts](0013-integration-service-api-contracts.html) * [ADR 30. Tekton Results Naming Convention](0030-tekton-results-naming-convention.html) +* [ADR 32. Decoupling Deployment](0032-decoupling-deployment.html) ## Context diff --git a/ADR/0016-integration-service-promotion-logic.md b/ADR/0016-integration-service-promotion-logic.md index b4a8aeac..3b5c0a3d 100644 --- a/ADR/0016-integration-service-promotion-logic.md +++ b/ADR/0016-integration-service-promotion-logic.md @@ -5,7 +5,7 @@ ## Status -Accepted +Superseded by [ADR 32. Decoupling Deployment](0032-decoupling-deployment.html) ## Context @@ -24,7 +24,7 @@ Note: This functionality has now been completely dropped from the build-service DevOps workflows often automate deployments of applications across different environments to ensure that the workloads are properly tested before being further promoted to an environment with a higher -service level agreement. +service level agreement. The promotion path can be represented with a directed acyclic graph from the environment with the lowest SLA to the one with the highest, for example development -> staging -> production. In AppStudio, this promotion logic would be represented by a set of components (container images) defined by @@ -76,7 +76,7 @@ push (merge-to-main) events gets promoted to lowest environments and released. 
## Consequences * As per this decision, Integration Service now holds the full charge to automatically promote the Snapshot of the - Application to the user’s defined lowest environments only. + Application to the user’s defined lowest environments only. The integration service doesn't hold the control to make promotions to the non-lowest/production environments. * Once all the tests succeed the Snapshot will always be deployed via a single code path, in a single service. diff --git a/ADR/0022-secret-mgmt-for-user-workloads.md b/ADR/0022-secret-mgmt-for-user-workloads.md index bf645eca..5574d3dd 100644 --- a/ADR/0022-secret-mgmt-for-user-workloads.md +++ b/ADR/0022-secret-mgmt-for-user-workloads.md @@ -7,6 +7,10 @@ Date revised: 2023-08-29 Accepted +Relates to: + +- [ADR 32. Decoupling Deployment](0032-decoupling-deployment.html) + ## Context * When user workloads are deployed to environments, the system should be able to provide a way to inject values that are specific to the environment. Currently, this is done through environment variables that are managed as overlays on the GitOps repository for the application. However, this method does not provide a good way to manage `Secret`. This ADR addresses the secret management of user workloads for different environments. diff --git a/ADR/0028-handling-snapshotenvironmentbinding-errors.md b/ADR/0028-handling-snapshotenvironmentbinding-errors.md index c1cdcd4f..8e883d1b 100644 --- a/ADR/0028-handling-snapshotenvironmentbinding-errors.md +++ b/ADR/0028-handling-snapshotenvironmentbinding-errors.md @@ -2,7 +2,7 @@ Date: 2023-08-31 ## Status -Accepted +Superseded by [ADR 32. Decoupling Deployment](0032-decoupling-deployment.html) ## Context It is currently not possible to determine whether a SnapshotEnvironmentBinding (SEB) is stuck in an unrecoverable state. This is a major problem when deciding if an ephemeral SEB needs to be cleaned up by the integration service's SnapshotEnvironmentBinding controller. 
An inability to clean up errored SEBs can overload the cluster. diff --git a/ADR/0032-decoupling-deployment.md b/ADR/0032-decoupling-deployment.md new file mode 100644 index 00000000..18d2eda8 --- /dev/null +++ b/ADR/0032-decoupling-deployment.md @@ -0,0 +1,239 @@ +# 32. Decoupling Deployment + +Date started: 2023-11-17 +Date accepted: 2023-02-01 + +## Status + +Accepted + +Relates to: + +- [ADR 14. Let Pipelines Proceed](0014-let-pipelines-proceed.html) +- [ADR 22. Secret Management for User Workloads](0022-secret-mgmt-for-user-workloads.html) + +Supersedes: + +- [ADR 08. Environment Provisioning](0008-environment-provisioning.html) +- [ADR 16. Integration Service Promotion Logic](0016-integration-service-promotion-logic.html) +- [ADR 28. Handling SnapshotEnvironmentBinding Errors](0028-handling-snapshotenvironmentbinding-errors.html) + +## Authors + +- Ralph Bean + +## Context + +Since the beginning of our project, we've had an emphasis on providing an integrated experience for +the user, automating all steps through build, test, deployment, and release to higher environments. + +Some challenges: + +- Our controllers' APIs are highly coupled. For example, it isn't feasible to try and use + [integration-service] today without [application-service] and [gitops-service]. This creates + a high barrier to engage with some of the useful subsystems we've built. A user (and potential + contributor) has to adopt the whole thing, or nothing. +- Today, we can generate simple but often incorrect deployment manifests for users' applications, + which we deploy via the application gitops repo. The user cannot interact with or influence these + resources directly; they have to do so through the Application/Component API. If we permit them to + directly provide deployment resources through that API, the API will become a leaky abstraction. 
+ If we don't permit them to provide their own resources, and instead try to handle every application + configuration case in the HAS generation code, we will struggle to keep up with users' demands for + the many and varied kinds of application deployments. + +See also [RHTAP-1873](https://issues.redhat.com/browse/RHTAP-1873). + +## Decision + +We are going to decouple the deployment from the build, test, and release portions of our system. + +**Deployment**: + +- The [Environment], [SnapshotEnvironmentBinding], and [GitOpsDeploymentManagedEnvironment] + resources will be deprecated and eventually decommissioned. +- The [application-service] will stop generating GitOps repo content and stop creating GitOps + repositories. +- We will stop deploying the [gitops-service] entirely and the RDS instance it uses. +- HAC will stop rendering [Environments] and their status. +- If a user wants to make use of deployment capabilities, we will promote the usage of [renovatebot] + to propagate _released_ images to their self-managed gitops repo as pull requests. Dependabot is + equally viable if the user's gitops repo is on GitHub. + +**Test and promotion**: + +- We will decommission the [DeploymentTargetClaim], [DeploymentTarget], and [DeploymentTargetClass] + APIs in favor of the new [Dynamic Resource Allocation APIs] that are an alpha feature of Kubernetes + v1.27 and OpenShift 4.14. +- [integration-service] should no longer create and manage [Environments] and the related + [DeploymentTargetClaims]. Users will be expected to provide integration test pipelines that (somehow) + specify `resourceClaims`, which will cause a provisioner to provision the compute and inject a + kubeconfig for the target compute into the taskrun pod. The user's test pipeline should then take + steps to *deploy an instance of their application to be tested* to the compute provided by the dynamic + resource allocation provisioner, using the provided kubeconfig. 
+- In the intervening time between now and when the Dynamic Resource Allocation APIs are available + (OpenShift 4.14 plus the time we need to implement a sandbox SpaceRequest *resource driver*), users + that need an ephemeral namespace for testing will need to employ a Task as the first step in their + pipeline that creates a SpaceRequest. They will need to use a finally task to clean up the SpaceRequest + after testing completes. +- We should promote [release-service] as the primary means to advertise to [renovatebot] that one or more + images have passed testing and are ready to be promoted to a particular environment (with a lowercase + "e") by way of image tags in a registry. + +### Out of scope + +Some other interesting ideas that are floating around these days, but which should be taken up in +other ADRs, if we take them up at all: + +- Decoupling [integration-service] and [release-service] from a shared [Snapshot] API. They will + continue to share a [Snapshot] API as of this ADR. +- Decoupling [integration-service] from the [application-service] APIs, like [Application] and + [Component]. It will still promote images to the "global candidate list" (on the [Component] + resources) and continue to use the list of [Components] to guide its construction of [Snapshots]. + +### Use Case Descriptions + +**During onboarding**: whereas today when a user requests a new appstudio tier namespace, the tier +template includes an [Environment] that the integration-service will promote to. Tomorrow, the +appstudio tier template should no longer include an [Environment] which on its own will cause +integration-service to _not_ trigger a deployment when testing completes. Instead, the appstudio +tier template should include a [ReleasePlan] with a reference to the [push-to-registry] +release pipeline. 
This new default [ReleasePlan] should carry parameters such that whenever +a [Snapshot] is successfully tested, a [Release] is created that re-tags the images in build-time +quay repositories with a tag like `:released` or `:validated` (name tbd). The +[push-to-registry] pipeline can use the `appstudio-pipeline` service account in the user's +dev workspace, which already has push access to the repository in question. + +```mermaid +flowchart TD + NSTierTemplate --> |provides| ReleasePlan + A[fa:fa-code-commit Git Commit] --> |webhook| PipelinesAsCode + PipelinesAsCode --> |creates| build[Build PipelineRun] + build --> |triggers| integration-service + integration-service --> |asks| decision{Is there a ReleasePlan?} + ReleasePlan --> |influences| decision + decision --> |yes| Release + build --> |pushes the image| quay.io/redhat-user-workloads/tenant/component + Release --> |tags the image for release| quay.io/redhat-user-workloads/tenant/component + quay.io/redhat-user-workloads/tenant/component --> renovatebot[User provided RenovateBot] + renovatebot --> |updates| gitops-repo[User provided GitOps repo] + gitops-repo --> |causes deployment via| argocd[User provided ArgoCD] + argocd --> |deploys to| cluster[User provided cluster] +``` + +Outside of the AppStudio member cluster, the user is responsible for acquiring a gitops repo and +deployment environments of their choice, manually laying out their application resources in the repo +(assisted by tools like `kam`), specifying image references by tag to match the `:released` or +`:validated` tagging scheme mentioned above, configuring ArgoCD to deploy from their gitops repo, and +configuring [renovatebot] to propagate image updates by digest to their gitops repo. Really, ArgoCD +here is just an example and other gitops tools could be used; renovate could even update Helm repos +with the new images. Options for the user are not limited. 
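
The onboarding flow above (a tested [Snapshot] plus a [ReleasePlan] yields a [Release] that re-tags images) can be sketched roughly as follows. This is only an illustration of the decision in the diagram, not the integration-service or release-service code: the `ReleasePlan` and `Release` classes, the `retag` helper, and `on_snapshot_tested` are hypothetical names invented for this sketch, and the `released` tag is a placeholder because the ADR leaves the tag name tbd.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ReleasePlan:
    """Hypothetical stand-in for the default ReleasePlan from the tier template."""
    name: str
    tag: str  # placeholder such as "released"; the ADR leaves the name tbd


@dataclass(frozen=True)
class Release:
    """Hypothetical stand-in for the Release created after testing succeeds."""
    plan: str
    retagged_images: Tuple[str, ...]


def retag(image: str, tag: str) -> str:
    """Return the same repository with its digest or tag replaced by `tag`."""
    repo = image.split("@", 1)[0]          # drop an "@sha256:..." digest if present
    last_segment = repo.rsplit("/", 1)[-1]
    if ":" in last_segment:                # drop an existing tag, keep registry ports
        repo = repo.rsplit(":", 1)[0]
    return f"{repo}:{tag}"


def on_snapshot_tested(images: List[str], tests_passed: bool,
                       plans: List[ReleasePlan]) -> List[Release]:
    """Mirror the 'Is there a ReleasePlan?' decision: no plan or failed tests, no Release."""
    if not tests_passed:
        return []
    return [
        Release(plan=p.name, retagged_images=tuple(retag(i, p.tag) for i in images))
        for p in plans
    ]
```

With no [ReleasePlan] in the namespace the function returns an empty list, which corresponds to the "no automatic deployment" default described above.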
+ +**For manual creation of new environments** - the user manages this directly using a combination of +their gitops repo and argo, outside of the AppStudio member cluster. + +**For automated testing in ephemeral environments** - the user specifies an +[IntegrationTestScenario] CR, which references a pipeline which (somehow) creates a `resourceClaim`. +After a build completes, the [integration-service] creates the PipelineRun which causes the **resource +driver** associated with the `resourceClaim` to provision the requested compute. The resource driver +injects the kubeconfig for the ephemeral compute into the pod, to be used by the test TaskRun. The +user's test TaskRun is responsible for deploying the user's app based on the [Snapshot] provided by +[integration-service] as a parameter and the kubeconfig injected into the TaskRun pod by the resource +driver before running tests. The resource driver should clean up after itself after the test TaskRun +and its corresponding pod complete. 
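
As a rough illustration of the test TaskRun's deployment step, the sketch below turns a [Snapshot] parameter and an injected kubeconfig path into `kubectl set image` command lines. Several things here are assumptions rather than anything this ADR specifies: that the Snapshot arrives as JSON with a `components` list whose entries carry `name` and `containerImage`, that each component maps to a Deployment and container of the same name, and the function name `deploy_commands` itself. The commands are only built, not executed.

```python
import json
from typing import List


def deploy_commands(snapshot_json: str, kubeconfig_path: str) -> List[List[str]]:
    """Build one `kubectl set image` invocation per component in the Snapshot.

    Assumes (hypothetically) that each component is deployed as a Deployment
    whose name and container name both equal the component name.
    """
    snapshot = json.loads(snapshot_json)
    commands = []
    for component in snapshot["components"]:
        name = component["name"]
        image = component["containerImage"]
        commands.append([
            "kubectl", "--kubeconfig", kubeconfig_path,  # kubeconfig injected by the resource driver
            "set", "image", f"deployment/{name}", f"{name}={image}",
        ])
    return commands
```

A real test pipeline would more likely apply the user's own manifests or Helm chart; the point is only that the Snapshot and the kubeconfig are the two inputs the user's task has to combine.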
+ +```mermaid +flowchart TD + User --> |provides| IntegrationTestScenario + User --> |provides| testpipeline + commit[fa:fa-code-commit Git Commit] --> |webhook| PipelinesAsCode + PipelinesAsCode --> |creates| build[Build PipelineRun] + build --> |triggers| integration-service + integration-service --> |consults| IntegrationTestScenario + IntegrationTestScenario --> |references| testpipeline + testpipeline --> |is used to create| testpipelinerun + testpipelinerun --> |prompts| driver[Resource Driver] + driver --> |creates| compute[Compute - cluster, namespace, or other external resource] + driver --> |creates| kubeconfig + kubeconfig --> |is injected into pod of| testpipelinerun[Test PipelineRun] + integration-service --> |creates| testpipelinerun[Test PipelineRun] + testpipelinerun --> |deploys user app to| compute + testpipelinerun --> |executes tests against| compute +``` + +## Consequences + +- Users who expect effortless deployment of their app when onboarding to the system will be + disappointed. They have more work to do to set up a deployment of their app outside the system. +- Users will lose visibility of their applications' deployments and status in the AppStudio UI + (HAC). Other systems like the Argo UI are arguably better at this than we are. +- Users who expect to provide and manage their own resources to control their app will be delighted. + They now no longer have to interact with an intermediary API to try to express details about their + deployment(s). +- As a team, we'll be in a better position to try to achieve independence for [integration-service], + make it usable outside the context of AppStudio, and ideally make it attractive for collaborators. + +## Implementation + +Some of these phases can be done at the same time. + +- Work with the only known users of ephemeral environments right now (Exhort team) to create the + intermediary solution: a pair of tekton tasks that create and destroy SpaceRequests. 
Use this + in their pipelines to drop usage of the existing Environment-cloning feature set. +- Create a Dynamic Resource Allocation resource driver that supports SpaceBindings +- [integration-service]: Drop the environment reference from the [IntegrationTestScenario] spec, + and related controller code for managing ephemeral [Environments] for tests. +- [release-service]: Drop the environment reference from the [ReleasePlanAdmission] spec, and + related controller code for managing a [SnapshotEnvironmentBinding]. +- [HAC]: update [IntegrationTestScenario] to no longer use [Environments]. +- [HAC]: Drop UI features showing the [Environments]: (commit view, Environments pane, etc.) +- [HAC]: Drop UI features differentiating "build" and "deploy" secrets. With this change, "deploy" + secrets are no longer relevant. +- [integration-service]: stop creating a [SEB] for the lowest [Environments]. +- [application-service]: stop generating the gitops repo content in response to [SEBs]. +- [application-service]: stop creating gitops repos. +- Drop the [Environment], [SnapshotEnvironmentBinding], [GitOpsDeploymentManagedEnvironment], + [DeploymentTarget], [DeploymentTargetClaim], and [DeploymentTargetClass] APIs from the + [application-api] repo. +- Stop deploying the [gitops-service] and decommission the RDS database. 
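
The intermediary solution in the first phase (one task that creates a SpaceRequest, and a finally task that destroys it) relies on finally tasks running even when an earlier task fails. The sketch below models only that ordering in plain Python. It is not Tekton and does not talk to any cluster: `run_pipeline` and the task names in the usage note are hypothetical, chosen to mirror the create/test/delete sequence described in this ADR.

```python
from typing import Callable, List, Tuple

Task = Tuple[str, Callable[[], None]]


def run_pipeline(tasks: List[Task], finally_tasks: List[Task]) -> Tuple[bool, List[str]]:
    """Run tasks in order, stop at the first failure, then always run finally tasks.

    Returns (succeeded, names of tasks that were started).
    """
    started: List[str] = []
    succeeded = True
    for name, task in tasks:
        started.append(name)
        try:
            task()
        except Exception:
            succeeded = False
            break  # remaining regular tasks are skipped after a failure
    for name, task in finally_tasks:
        started.append(name)  # cleanup runs whether or not the tests passed
        task()
    return succeeded, started
```

For example, with a `create-space-request` task, a failing `run-tests` task, and a `delete-space-request` finally task, the cleanup still runs, so the ephemeral namespace is not leaked when tests fail.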
+ +[Dynamic Resource Allocation APIs]: https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/ +[renovatebot]: https://github.com/renovatebot/renovate +[deployment-target-operator]: # +[gitops-service]: ../ref/gitops-service.md +[push-to-registry]: https://github.com/redhat-appstudio/release-service-catalog/tree/main/pipelines/push-to-external-registry +[application-api]: https://github.com/redhat-appstudio/application-api +[application-service]: ../architecture/application-service.md +[integration-service]: ../architecture/integration-service.md +[release-service]: ../architecture/release-service.md +[Application]: ../ref/application-environment-api.md#application +[Applications]: ../ref/application-environment-api.md#application +[Component]: ../ref/application-environment-api.md#component +[Components]: ../ref/application-environment-api.md#component +[Environment]: ../ref/application-environment-api.md#environment +[Environments]: ../ref/application-environment-api.md#environment +[GitOpsDeploymentManagedEnvironment]: ../ref/application-environment-api.md#GitOpsDeploymentManagedEnvironment +[GitOpsDeploymentManagedEnvironments]: ../ref/application-environment-api.md#GitOpsDeploymentManagedEnvironment +[SnapshotEnvironmentBinding]: ../ref/application-environment-api.md#snapshotenvironmentbinding +[SnapshotEnvironmentBindings]: ../ref/application-environment-api.md#snapshotenvironmentbinding +[Snapshot]: ../ref/application-environment-api.md#snapshot +[Snapshots]: ../ref/application-environment-api.md#snapshot +[Release]: ../ref/release-service.md#Release +[Releases]: ../ref/release-service.md#Release +[ReleasePlan]: ../ref/release-service.md#ReleasePlan +[ReleasePlans]: ../ref/release-service.md#ReleasePlan +[ReleasePlanAdmission]: ../ref/release-service.md#ReleasePlanAdmission +[ReleasePlanAdmissions]: ../ref/release-service.md#ReleasePlanAdmission +[IntegrationTestScenario]: ../ref/integration-service.md#IntegrationTestScenario 
+[IntegrationTestScenarios]: ../ref/integration-service.md#IntegrationTestScenario +[DT]: ../ref/application-environment-api.md#deploymenttarget +[DTs]: ../ref/application-environment-api.md#deploymenttarget +[DeploymentTarget]: ../ref/application-environment-api.md#deploymenttarget +[DeploymentTargets]: ../ref/application-environment-api.md#deploymenttarget +[DTC]: ../ref/application-environment-api.md#deploymenttargetclaim +[DTCs]: ../ref/application-environment-api.md#deploymenttargetclaim +[DeploymentTargetClaim]: ../ref/application-environment-api.md#deploymenttargetclaim +[DeploymentTargetClaims]: ../ref/application-environment-api.md#deploymenttargetclaim +[DTCls]: ../ref/application-environment-api.md#deploymenttargetclass +[DTClses]: ../ref/application-environment-api.md#deploymenttargetclass +[DeploymentTargetClass]: ../ref/application-environment-api.md#deploymenttargetclass +[DeploymentTargetClasses]: ../ref/application-environment-api.md#deploymenttargetclass