What happened?
When scaling up a CassandraDatacenter (e.g., from 9 to 18 nodes) using the cass-operator, the operator first updates the StatefulSets and waits for the scale-up operation to complete. However, the PodDisruptionBudget minAvailable field is only updated after the scaling operation finishes.
This behavior introduces a critical issue:
During the scaling process, the PDB minAvailable is still set to the previous value (e.g., 8) even though there are now 18 pods in the cluster, so the PDB tolerates up to 10 voluntary disruptions. If pods go down in this window due to node disruptions, multiple pods could be lost across different Availability Zones, severely degrading the cluster's availability.
In contrast, scaling down (e.g., from 18 to 9) is handled properly, as the PDB minAvailable remains at 17 until the scale-down operation completes. This ensures adequate availability and avoids the described issue.
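To make the exposure concrete, here is a minimal sketch (in Go, using the numbers from the example above) of how many voluntary evictions the stale PDB tolerates mid-scale-up. The arithmetic mirrors how the disruption controller derives the allowed disruptions from minAvailable; the snippet is purely illustrative and is not operator code.

```go
package main

import "fmt"

func main() {
	// Numbers from the example above: a 9-node datacenter being scaled to 18 nodes.
	staleMinAvailable := 8 // PDB value left over from the 9-node topology (size - 1)
	currentPods := 18      // pods that already exist once the StatefulSets have been scaled

	// Voluntary disruptions are allowed as long as healthy pods stay >= minAvailable,
	// so the stale PDB tolerates this many simultaneous evictions:
	fmt.Println("evictions tolerated by stale PDB:", currentPods-staleMinAvailable) // 10
	fmt.Println("evictions tolerated once the PDB is updated to minAvailable=17:", currentPods-17) // 1
}
```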
CassandraDatacenter events
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingUpRack 10m cass-operator Scaling up rack ra
Normal ScalingUpRack 10m cass-operator Scaling up rack rb
Normal ScalingUpRack 10m cass-operator Scaling up rack rc
Normal StartingCassandra 9m22s cass-operator Starting Cassandra for pod othmane-test-dc1-aa32-ra-sts-1
Normal StartedCassandra 6m52s cass-operator Started Cassandra for pod othmane-test-dc1-aa32-ra-sts-1
Normal StartingCassandra 6m51s cass-operator Starting Cassandra for pod othmane-test-dc1-aa32-rb-sts-1
Normal StartedCassandra 4m27s cass-operator Started Cassandra for pod othmane-test-dc1-aa32-rb-sts-1
Normal StartingCassandra 4m26s cass-operator Starting Cassandra for pod othmane-test-dc1-aa32-rc-sts-1
Normal StartedCassandra 2m cass-operator Started Cassandra for pod othmane-test-dc1-aa32-rc-sts-1
Normal CreatedResource 119s cass-operator Created PodDisruptionBudget dc1-aa32-pdb
From the events logged in the CassandraDatacenter CR, it is clear that the PDB recreation is the last action performed. To address this, I propose the following:
Rearrange the Order of Checks: Update the reconciliation logic so that the PDB is recreated (with the updated minAvailable value) before the scale-up check is performed.
This can be implemented by reordering the reconciliation loop so that the PDB update runs before the scale-up step (see the sketch after this list).
Retain Current Order for Scale-Downs: In the case of a scale-down, the current behavior is correct and should remain unchanged. The PDB update should continue to occur after the scale-down to ensure that minAvailable is appropriately decremented only after the reduction in nodes is complete.
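A minimal sketch of the proposed ordering, assuming a simplified reconcile loop. All names below (reconcileScale, updatePodDisruptionBudget, scaleUpRacks, scaleDownRacks) are hypothetical placeholders for illustration, not actual cass-operator functions; the point is only the order of the two steps in each direction.

```go
package main

import "fmt"

// Hypothetical stand-ins for the operator's real actions, used only to show ordering.
func updatePodDisruptionBudget(minAvailable int) error {
	fmt.Println("update PDB minAvailable =", minAvailable)
	return nil
}
func scaleUpRacks(size int) error   { fmt.Println("scale StatefulSets up to", size); return nil }
func scaleDownRacks(size int) error { fmt.Println("scale StatefulSets down to", size); return nil }

// reconcileScale sketches the proposal: update the PDB before a scale-up,
// but keep updating it after a scale-down (the current, correct behavior).
func reconcileScale(currentSize, desiredSize int) error {
	switch {
	case desiredSize > currentSize:
		if err := updatePodDisruptionBudget(desiredSize - 1); err != nil {
			return err
		}
		return scaleUpRacks(desiredSize)
	case desiredSize < currentSize:
		if err := scaleDownRacks(desiredSize); err != nil {
			return err
		}
		return updatePodDisruptionBudget(desiredSize - 1)
	}
	return nil
}

func main() {
	_ = reconcileScale(9, 18) // scale-up: PDB first, then StatefulSets
	_ = reconcileScale(18, 9) // scale-down: StatefulSets first, then PDB
}
```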
I’d be happy to investigate further and contribute to implementing this solution. Let me know your thoughts on this approach or if there are additional considerations to take into account.
What did you expect to happen?
The PDB minAvailable field should be updated before the StatefulSets update during a scale-up operation. This would prevent a scenario where the PDB minAvailable value is too low to protect the cluster's availability during scaling.
How can we reproduce it (as minimally and precisely as possible)?
Scale a CassandraDatacenter up.
Observe that the StatefulSets are updated and new pods are created.
Note that the PDB minAvailable is not updated until the scale-up is complete.
Simulate pod failures during the scale-up process.
cass-operator version
1.22.0
Kubernetes version
1.31.2
Method of installation
No response
Anything else we need to know?
No response
┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: CASS-83