camunda · conceptualshark · Aug 30, 2024 · Aug 22, 2024 · Aug 23, 2024 · Aug 29, 2024
diff --git a/docs/self-managed/concepts/multi-region/dual-region.md b/docs/self-managed/concepts/multi-region/dual-region.md
@@ -54,8 +54,6 @@ The currently supported Camunda 8 Self-Managed components are:
 
 The overall system is **active-passive**, even though some components may be **active-active**. You will have to take care of the user traffic routing or DNS by yourself, and won't be considered further. Select one region as the actively serving region and route the user traffic there. In case of a total region failure, route the traffic to the passive region yourself.
 
-<!-- Should we provide some reading materials on how to tackle this? -->
-
 ### Components
 
 #### Zeebe
@@ -129,11 +127,8 @@ In the event of a total active region loss, the following data will be lost:
   - Role Based Access Control (RBAC) does not work.
 - Optimize is not supported.
   - This is due to Optimize depending on Identity to work.
-- Connectors are not supported.
-  - This is due to Connectors depending on Operate to work for inbound Connectors and potentially resulting in race condition.
-- During the failback procedure, there’s a small chance that some data will be lost in Elasticsearch affecting Operate and Tasklist.
-  - This **does not** affect the processing of process instances in any way. The impact is that some information about the affected instances might not be visible in Operate and Tasklist.
-  - This is further explained in the [operational procedure](./../../operational-guides/multi-region/dual-region-ops.md?failback=step2#failback) during the relevant step.
+- Connectors can be deployed alongside, but make sure you understand idempotency as described in the [inbound Connectors documentation](../../../components/connectors/use-connectors/inbound.md#creating-the-connector-event).
+  - In a dual-region setup, you'll have two Connectors deployments, so message idempotency is important to avoid duplicate events.
 - Zeebe cluster scaling is not supported.
 - Web-Modeler is a standalone component and is not covered in this guide.
   - Modeling applications can operate independently outside of the automation clusters.
@@ -194,14 +189,13 @@ The **Recovery Point Objective (RPO)** is the maximum tolerable data loss measur
 
 The **Recovery Time Objective (RTO)** is the time to restore services to a functional state.
 
-For Zeebe the **RPO** is **0**.
-
-For Operate and Tasklist the **RPO** is close to **0** for critical data due to the previously mentioned small chance of data loss in Elasticsearch during the failback procedure.
+For Operate, Tasklist, and Zeebe the **RPO** is **0**.
 
 The **RTO** can be considered for the **failover** and **failback** procedures, both resulting in a functional state.
 
-- **failover** has an **RTO** of **15-20** minutes to restore a functional state, excluding DNS considerations.
-- **failback** has an **RTO** of **25-30 + X** minutes to restore a functional state. Where X is the time it takes to back up and restore Elasticsearch, which is highly dependent on the setup and chosen [Elasticsearch backup type](https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshots-register-repository.html#ess-repo-types).
+- **failover** has an **RTO** of **< 1** minute to restore a functional state, excluding DNS considerations.
+- **failback** has an **RTO** of **5 + X** minutes to restore a functional state, where X is the time it takes to back up and restore Elasticsearch. This timing is highly dependent on the setup and chosen [Elasticsearch backup type](https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshots-register-repository.html#ess-repo-types).
+  During our automated tests, the reinstallation and reconfiguration of Camunda 8 takes 5 minutes. This can serve as a general guideline for the time required, though your experience may vary depending on your available resources and familiarity with the operational procedure.
 
 :::info
 

diff --git a/docs/self-managed/operational-guides/multi-region/dual-region-ops.md b/docs/self-managed/operational-guides/multi-region/dual-region-ops.md
diff --git a/docs/self-managed/operational-guides/multi-region/img/10.svg b/docs/self-managed/operational-guides/multi-region/img/10.svg
diff --git a/docs/self-managed/operational-guides/multi-region/img/11.svg b/docs/self-managed/operational-guides/multi-region/img/11.svg
diff --git a/docs/self-managed/operational-guides/multi-region/img/12.svg b/docs/self-managed/operational-guides/multi-region/img/12.svg
diff --git a/docs/self-managed/operational-guides/multi-region/img/13.svg b/docs/self-managed/operational-guides/multi-region/img/13.svg
diff --git a/docs/self-managed/operational-guides/multi-region/img/14.svg b/docs/self-managed/operational-guides/multi-region/img/14.svg
diff --git a/docs/self-managed/operational-guides/multi-region/img/15.svg b/docs/self-managed/operational-guides/multi-region/img/15.svg
diff --git a/docs/self-managed/operational-guides/multi-region/img/3.svg b/docs/self-managed/operational-guides/multi-region/img/3.svg
diff --git a/docs/self-managed/operational-guides/multi-region/img/4.svg b/docs/self-managed/operational-guides/multi-region/img/4.svg
diff --git a/docs/self-managed/operational-guides/multi-region/img/5.svg b/docs/self-managed/operational-guides/multi-region/img/5.svg
diff --git a/docs/self-managed/operational-guides/multi-region/img/6.svg b/docs/self-managed/operational-guides/multi-region/img/6.svg
diff --git a/docs/self-managed/operational-guides/multi-region/img/7.svg b/docs/self-managed/operational-guides/multi-region/img/7.svg
diff --git a/docs/self-managed/operational-guides/multi-region/img/8.svg b/docs/self-managed/operational-guides/multi-region/img/8.svg
diff --git a/docs/self-managed/operational-guides/multi-region/img/9.svg b/docs/self-managed/operational-guides/multi-region/img/9.svg
diff --git a/docs/self-managed/setup/deploy/amazon/amazon-eks/dual-region.md b/docs/self-managed/setup/deploy/amazon/amazon-eks/dual-region.md
@@ -4,7 +4,7 @@ title: "Dual-region setup (EKS)"
 description: "Deploy two Amazon Kubernetes (EKS) clusters with Terraform for a peered setup allowing dual-region communication."
 ---
 
-<!-- Image source: https://docs.google.com/presentation/d/1mbEIc0KuumQCYeg1YMpvdVR8AEUcbTWqlesX-IxVIjY/edit?usp=sharing -->
+<!-- Image source: https://docs.google.com/presentation/d/1w1KUsvx4r6RS7DAozx6X65BtLJcIxU6ve_y3bYFcfYk/edit?usp=sharing -->
 
 import CoreDNSKubeDNS from "./assets/core-dns-kube-dns.svg"
 
@@ -22,8 +22,8 @@ This guide requires you to have previously completed or reviewed the steps taken
 
 - An [AWS account](https://docs.aws.amazon.com/accounts/latest/reference/accounts-welcome.html) to create resources within AWS.
 - [Helm (3.x)](https://helm.sh/docs/intro/install/) for installing and upgrading the [Camunda Helm chart](https://github.com/camunda/camunda-platform-helm).
-- [Kubectl (1.28.x)](https://kubernetes.io/docs/tasks/tools/#kubectl) to interact with the cluster.
-- [Terraform (1.7.x)](https://developer.hashicorp.com/terraform/downloads)
+- [Kubectl (1.30.x)](https://kubernetes.io/docs/tasks/tools/#kubectl) to interact with the cluster.
+- [Terraform (1.9.x)](https://developer.hashicorp.com/terraform/downloads)
 
 ## Considerations
 
@@ -69,8 +69,6 @@ You have to choose unique namespaces for Camunda 8 installations. The namespace
 For example, you can install Camunda 8 into `CAMUNDA_NAMESPACE_0` in `CLUSTER_0`, and `CAMUNDA_NAMESPACE_1` on the `CLUSTER_1`, where `CAMUNDA_NAMESPACE_0` != `CAMUNDA_NAMESPACE_1`.
 Using the same namespace names on both clusters won't work as CoreDNS won't be able to distinguish between traffic targeted at the local and remote cluster.
 
-In addition to namespaces for Camunda installations, create the namespaces for failover (`CAMUNDA_NAMESPACE_0_FAILOVER` in `CLUSTER_0` and `CAMUNDA_NAMESPACE_1_FAILOVER` in `CLUSTER_1`), for the case of a total region loss. This is for completeness, so you don't forget to add the mapping on region recovery. The operational procedure is handled in a different [document on dual-region](./../../../../operational-guides/multi-region/dual-region-ops.md).
-
 :::
 
 4. Execute the script via the following command:
@@ -259,13 +257,6 @@ kubectl --context cluster-london -n kube-system edit configmap coredns
             force_tcp
         }
     }
-    camunda-paris-failover.svc.cluster.local:53 {
-        errors
-        cache 30
-        forward . 10.202.19.54 10.202.53.21 10.202.84.222 {
-            force_tcp
-        }
-    }
 ### Cluster 0 - End ###
 
 Please copy the following between
@@ -282,13 +273,6 @@ kubectl --context cluster-paris -n kube-system edit configmap coredns
             force_tcp
         }
     }
-    camunda-london-failover.svc.cluster.local:53 {
-        errors
-        cache 30
-        forward . 10.192.27.56 10.192.84.117 10.192.36.238 {
-            force_tcp
-        }
-    }
 ### Cluster 1 - End ###
 ```
 
@@ -340,13 +324,6 @@ data:
             force_tcp
         }
     }
-    camunda-paris-failover.svc.cluster.local:53 {
-        errors
-        cache 30
-        forward . 10.202.19.54 10.202.53.21 10.202.84.222 {
-            force_tcp
-        }
-    }
 ```
 
   </summary>
@@ -375,7 +352,7 @@ The script [test_dns_chaining.sh](https://github.com/camunda/c8-multi-region/blo
 
 ### Create the secret for Elasticsearch
 
-Elasticsearch will need an S3 bucket for data backup and restore procedure, required during a regional failover. For this, you will need to configure a Kubernetes secret to not expose those in cleartext.
+Elasticsearch needs an S3 bucket for the data backup and restore procedure required during a regional failback. For this, configure a Kubernetes secret so that the bucket credentials are not exposed in cleartext.
 
 You can pull the data from Terraform since you exposed those via `output.tf`.
 

diff --git a/versioned_docs/version-8.5/self-managed/concepts/multi-region/dual-region.md b/versioned_docs/version-8.5/self-managed/concepts/multi-region/dual-region.md
@@ -129,8 +129,8 @@ In the event of a total active region loss, the following data will be lost:
   - Role Based Access Control (RBAC) does not work.
 - Optimize is not supported.
   - This is due to Optimize depending on Identity to work.
-- Connectors are not supported.
-  - This is due to Connectors depending on Operate to work for inbound Connectors and potentially resulting in race condition.
+- Connectors can be deployed alongside, but make sure you understand idempotency as described in the [inbound Connectors documentation](../../../components/connectors/use-connectors/inbound.md#creating-the-connector-event).
+  - In a dual-region setup, you'll have two Connectors deployments, so message idempotency is important to avoid duplicate events.
 - During the failback procedure, there’s a small chance that some data will be lost in Elasticsearch affecting Operate and Tasklist.
   - This **does not** affect the processing of process instances in any way. The impact is that some information about the affected instances might not be visible in Operate and Tasklist.
   - This is further explained in the [operational procedure](./../../operational-guides/multi-region/dual-region-ops.md?failback=step2#failback) during the relevant step.

diff --git a/...ocs/version-8.5/self-managed/operational-guides/multi-region/dual-region-ops.md b/...ocs/version-8.5/self-managed/operational-guides/multi-region/dual-region-ops.md
@@ -149,6 +149,8 @@ One of the regions is lost, meaning Zeebe:
 
 For the failover procedure, ensure the lost region does not accidentally reconnect. You should be sure it is lost, and if so, look into measures to prevent it from reconnecting. For example, by utilizing the suggested solution below to isolate your active environment.
 
+It's crucial to keep the environments isolated because, during the operational procedure, there will be duplicate Zeebe broker IDs. These would collide if the environments are not correctly isolated and the lost region accidentally comes back online.
+
 #### How to get there
 
 Depending on your architecture, possible approaches are:
@@ -585,7 +587,7 @@ kubectl --context $CLUSTER_SURVIVING scale -n $CAMUNDA_NAMESPACE_SURVIVING deplo
 kubectl --context $CLUSTER_SURVIVING scale -n $CAMUNDA_NAMESPACE_SURVIVING deployments/$HELM_RELEASE_NAME-tasklist --replicas 0
 ```
 
-2. Disable the Zeebe Elasticsearch exporters in Zeebe via kubectl:
+2. Disable the Zeebe Elasticsearch exporters in Zeebe via kubectl using the [exporting API](./../../zeebe-deployment/operations/management-api.md#exporting-api):
 
 ```bash
 kubectl --context $CLUSTER_SURVIVING port-forward services/$HELM_RELEASE_NAME-zeebe-gateway 9600:9600 -n $CAMUNDA_NAMESPACE_SURVIVING

diff --git a/...ned_docs/version-8.5/self-managed/setup/deploy/amazon/amazon-eks/dual-region.md b/...ned_docs/version-8.5/self-managed/setup/deploy/amazon/amazon-eks/dual-region.md
@@ -226,10 +226,10 @@ kubectl --context $CLUSTER_0 apply -f https://raw.githubusercontent.com/camunda/
 kubectl --context $CLUSTER_1 apply -f https://raw.githubusercontent.com/camunda/c8-multi-region/main/aws/dual-region/kubernetes/internal-dns-lb.yml
 ```
 
-2. Execute the script [generate_core_dns_entry.sh](https://github.com/camunda/c8-multi-region/blob/main/aws/dual-region/scripts/generate_core_dns_entry.sh) in the folder `aws/dual-region/scripts/` of the repository to help you generate the CoreDNS config. Make sure that you have previously exported the [environment prerequisites](#environment-prerequisites) since the script builds on top of it.
+2. From the folder `aws/dual-region/scripts/` of the repository, execute the script [generate_core_dns_entry.sh](https://github.com/camunda/c8-multi-region/blob/main/aws/dual-region/scripts/generate_core_dns_entry.sh) with the parameter `legacy` to help you generate the CoreDNS config. Make sure that you have previously exported the [environment prerequisites](#environment-prerequisites) since the script builds on top of them.
 
 ```shell
-./generate_core_dns_entry.sh
+./generate_core_dns_entry.sh legacy
 ```
 
 3. The script will retrieve the IPs of the load balancer via the AWS CLI and return the required config change.
@@ -244,7 +244,7 @@ For illustration purposes only. These values will not work in your environment.
 :::
 
 ```shell
-./generate_core_dns_entry.sh
+./generate_core_dns_entry.sh legacy
 Please copy the following between
 ### Cluster 0 - Start ### and ### Cluster 0 - End ###
 and insert it at the end of your CoreDNS configmap in Cluster 0
@@ -375,7 +375,7 @@ The script [test_dns_chaining.sh](https://github.com/camunda/c8-multi-region/blo
 
 ### Create the secret for Elasticsearch
 
-Elasticsearch will need an S3 bucket for data backup and restore procedure, required during a regional failover. For this, you will need to configure a Kubernetes secret to not expose those in cleartext.
+Elasticsearch needs an S3 bucket for the data backup and restore procedure required during a regional failback. For this, configure a Kubernetes secret so that the bucket credentials are not exposed in cleartext.
 
 You can pull the data from Terraform since you exposed those via `output.tf`.