From 712081197102ef37f3aa3ab1ad1a3f883f9fb232 Mon Sep 17 00:00:00 2001
From: Chiman Jain <chimanjain15@gmail.com>
Date: Tue, 26 Nov 2024 13:11:31 +0530
Subject: [PATCH] add troubleshooting guide for synciq error

---
 content/docs/replication/troubleshooting.md | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/content/docs/replication/troubleshooting.md b/content/docs/replication/troubleshooting.md
index 325e9459c5..391d744e24 100644
--- a/content/docs/replication/troubleshooting.md
+++ b/content/docs/replication/troubleshooting.md
@@ -7,15 +7,16 @@ description: >
 ---
 
 | Symptoms | Prevention, Resolution or Workaround |
-| --- | --- | 
+| --- | --- |
 | Persistent volumes don't get created on the target cluster. |  Run `kubectl describe` on one of the pods of replication controller and see if event says `Config update won't be applied because of invalid configmap/secrets. Please fix the invalid configuration`. If it does, then ensure you correctly populated replication ConfigMap. You can check the current status by running `kubectl describe cm -n dell-replication-controller dell-replication-controller-config`. If ConfigMap is empty, please edit it yourself or use `repctl cluster inject` command. |
-| Persistent volumes don't get created on the target cluster. You don't see any events on the replication-controller pod. | Check logs of replication controller by running `kubectl logs -n dell-replication-controller dell-replication-controller-manager-<generated-symbols>`. If you see `clusterId - <clusterID> not found` errors then be sure to check if you specified the same clusterIDs in both your ConfigMap and replication enabled StorageClass. | 
-| You apply replication action by manually editing ReplicationGroup resource field `spec.action` and don't see any change of ReplicationGroup state after a while.  | Check events of the replication-controller pod, if it says `Cannot proceed with action <your-action>. [unsupported action]` then check spelling of your action and consult the [Replication Actions](../replication-actions) page. Alternatively, you can use `repctl` instead of manually editing ReplicationGroup resources. | 
-| You execute failover action using `repctl failover` command and see `failover: error executing failover to source site`. | This means you tried to failover to a cluster that is already marked source. If you still want to execute failover for RG, just choose another cluster. | 
-| You've created PersistentVolumeClaim using replication enabled StorageClass but don't see any RGs created in the source cluster. | Check annotations of created PersistentVolumeClaim. If it doesn't have `annotations` that start with `replication.storage.dell.com` then please wait for a couple of minutes for them to be added and RG to be created. | 
+| Persistent volumes don't get created on the target cluster. You don't see any events on the replication-controller pod. | Check logs of replication controller by running `kubectl logs -n dell-replication-controller dell-replication-controller-manager-<generated-symbols>`. If you see `clusterId - <clusterID> not found` errors then be sure to check if you specified the same clusterIDs in both your ConfigMap and replication enabled StorageClass. |
+| You apply replication action by manually editing ReplicationGroup resource field `spec.action` and don't see any change of ReplicationGroup state after a while.  | Check events of the replication-controller pod, if it says `Cannot proceed with action <your-action>. [unsupported action]` then check spelling of your action and consult the [Replication Actions](../replication-actions) page. Alternatively, you can use `repctl` instead of manually editing ReplicationGroup resources. |
+| You execute failover action using `repctl failover` command and see `failover: error executing failover to source site`. | This means you tried to failover to a cluster that is already marked source. If you still want to execute failover for RG, just choose another cluster. |
+| You've created PersistentVolumeClaim using replication enabled StorageClass but don't see any RGs created in the source cluster. | Check annotations of created PersistentVolumeClaim. If it doesn't have `annotations` that start with `replication.storage.dell.com` then please wait for a couple of minutes for them to be added and RG to be created. |
 | When installing common replication controller using helm you see an error that states `invalid ownership metadata` and `missing key "app.kubernetes.io/managed-by": must be set to "Helm"` | This means that you haven't fully deleted the previous release, you can fix it by either deleting entire manifest by using `kubectl delete -f deploy/controller.yaml` or manually deleting conflicting resources (ClusterRoles, ClusterRoleBinding, etc.) |
 | PV and/or PVCs are not being created at the source/target cluster. If you check the controller's logs you can see `no such host` errors| Make sure cluster-1's API is pingable from cluster-2 and vice versa. If one of your clusters is OpenShift located in a private network and needs records in /etc/hosts, `exec` into controller pod and modify `/etc/hosts` manually. |
 | After upgrading to Replication v1.4.0, if `kubectl get rg` returns an error `Unable to list "replication.storage.dell.com/v1alpha1, Resource=dellcsireplicationgroups"`| This means `kubectl` still doesn't recognize the new version of CRD `dellcsireplicationgroups.replication.storage.dell.com` after upgrade. Running the command `kubectl get DellCSIReplicationGroup.v1.replication.storage.dell.com/<rg-id> -o yaml` will resolve the issue. |
 | To add or delete PV s in the existing SYNC Replication Group in PowerStore, you may encounter the error `The operation is restricted as sync replication session for resource <Replication Group Name> is not paused` | To resolve this, you need to pause the replication group, add the PV, and then resume the replication group (RG). The commands for the pause and resume operations are: `repctl --rg <rg-id> exec -a suspend`  `repctl --rg <rg-id> exec -a resume` |
-| To delete the last volume from the existing SYNC Replication Group in Powerstore, you may encounter the error 'failed to remove volume from volume group: The operation cannot be completed on metro or replicated volume group because volume group will become empty after last members are removed' | To resolve this, unassign the protection policy from the corresponding volume group on the PowerStore Manager UI. After that, you can successfully delete the last volume in that SYNC Replication Group.| 
+| To delete the last volume from the existing SYNC Replication Group in PowerStore, you may encounter the error `failed to remove volume from volume group: The operation cannot be completed on metro or replicated volume group because volume group will become empty after last members are removed` | To resolve this, unassign the protection policy from the corresponding volume group on the PowerStore Manager UI. After that, you can successfully delete the last volume in that SYNC Replication Group.|
 | When running CSI-PowerMax with Replication in a multi-cluster configuration, the driver on the target cluster fails and the following error is seen in logs: `error="CSI reverseproxy service host or port not found, CSI reverseproxy not installed properly"` | The reverseproxy service needs to be created manually on the target cluster. Follow [the instructions here](../../deployment/csmoperator/modules/replication#configuration-steps) to create it.|
+| When using CSI-PowerScale with Replication and encryption enabled, you see the error `SyncIQ policy failed to establish an encrypted connection`, and the replication groups and PVCs are not created on the target cluster. | In OneFS 9.0 and later, the `encryption required` flag in the SyncIQ settings is set to "yes" by default. To resolve this error, follow the steps in this article: <https://www.dell.com/support/kbdoc/en-us/000215174/isilon-synciq-9-0-all-policies-fail-when-source-or-target-cluster-is-on-onefs-9-0-with-no-node-on-source-cluster-was-able-to-connect-to-target-cluster> |