[BUG] Election loop happens when there are many cluster manager nodes and cluster state size is large #11685

Closed
soosinha opened this issue Dec 28, 2023 · 0 comments
Labels: bug, Cluster Manager

Describe the bug

When a cluster has many cluster manager nodes (5 or more) and the cluster state is large, the cluster gets stuck in an election loop if the active cluster manager node drops for any reason.
Explanation: With 5 cluster manager nodes, if the active cluster manager drops, the remaining 4 cluster manager nodes start their elections within 100 ms (the initial timeout) of each other. When the first election reaches a quorum, that node becomes the leader and cancels its election scheduler. While it is still computing the cluster state, the next election also succeeds, because the 3 other remaining nodes can still vote to form a quorum. That election increments the term, so the previously elected leader steps down and restarts its own election. The cycle repeats without any backoff because a new election scheduler is created every time.
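To make the timing argument concrete, here is a minimal toy simulation of the loop described above. It is only a sketch under simplifying assumptions, not OpenSearch's coordination code: every scheduled election is assumed to win a quorum, delays are deterministic rather than randomized, and the names `simulate_elections`, `backoff_ms`, `max_ms` and `horizon_ms` are hypothetical. Setting `backoff_ms=0` models a fresh scheduler being created on every step-down.

```python
def simulate_elections(start_ms, publish_ms=1000, initial_ms=100,
                       backoff_ms=0, max_ms=10_000, horizon_ms=600_000):
    """Toy model of the election loop (illustrative only).

    start_ms maps node name -> time of its first election attempt.
    Returns (leader, term, commit_time_ms), or None if no publication
    completes before horizon_ms.
    """
    next_election = dict(start_ms)           # node -> time of next attempt
    step_downs = {node: 0 for node in start_ms}
    term = 0
    leader, leader_since = None, 0
    while next_election:
        node, t = min(next_election.items(), key=lambda kv: (kv[1], kv[0]))
        # The current leader commits if nobody else is elected while it
        # is still computing and publishing the cluster state.
        if leader is not None and t - leader_since >= publish_ms:
            return leader, term, leader_since + publish_ms
        if t > horizon_ms:
            return None
        del next_election[node]               # winner cancels its scheduler
        if leader is not None:
            # A higher term forces the previous leader to step down and
            # schedule a new election. With backoff_ms == 0 the delay never
            # grows, which mimics creating a new scheduler every time.
            step_downs[leader] += 1
            delay = min(max_ms, initial_ms + backoff_ms * step_downs[leader])
            next_election[leader] = t + delay
        term += 1
        leader, leader_since = node, t
    return leader, term, leader_since + publish_ms


starts = {"node-a": 0, "node-b": 25, "node-c": 50, "node-d": 75}
print(simulate_elections(starts, backoff_ms=0))    # None: the loop never commits
print(simulate_elections(starts, backoff_ms=100))  # a leader eventually commits
```

In this model, without backoff the gap between consecutive elections never exceeds the 100 ms initial timeout, so a publication that needs about 1 second can never finish. Once the delay grows with each step-down, the gaps eventually exceed the publication time and one node commits.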

Related component

Cluster Manager

To Reproduce

  1. Create a cluster with 5 cluster manager nodes.
  2. Create enough indices that cluster state computation takes around 1 second.
  3. Kill the active cluster manager node.
  4. Check the logs of the remaining cluster manager nodes: the election succeeds, but the cluster state publication fails with the following error:
FailedToCommitClusterStateException[node is no longer cluster-manager for term 673682 while handling publication]
        at org.opensearch.cluster.coordination.Coordinator.publish(Coordinator.java:1280)
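One way to confirm the loop from the logs is to pull the term numbers out of these failures and check that they keep climbing. The helper below is a small sketch; `failed_terms` is a hypothetical name, and it assumes only the message format shown in the exception above.

```python
import re

# Matches the message from the FailedToCommitClusterStateException above.
FAILED_PUBLICATION = re.compile(
    r"node is no longer cluster-manager for term (\d+) while handling publication"
)


def failed_terms(log_lines):
    """Return the terms of all failed publications, in log order."""
    return [
        int(match.group(1))
        for line in log_lines
        if (match := FAILED_PUBLICATION.search(line))
    ]


sample = [
    "FailedToCommitClusterStateException[node is no longer cluster-manager "
    "for term 673682 while handling publication]",
]
print(failed_terms(sample))  # [673682]
```

A long run of distinct, increasing terms within a short time window would be consistent with the election loop described in this issue.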

Expected behavior

When the active cluster manager node drops, the next election should happen immediately. If the elected node then fails to publish the cluster state, the election scheduler should not be cancelled; instead, the next election should start after a backoff time.
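The expected behavior could look roughly like the sketch below: the attempt counter survives a failed publication and is reset only after a successful commit. This is an illustration, not the actual OpenSearch scheduler; the class `ElectionBackoff`, its method names, and the `backoff_ms` and `max_ms` defaults are assumptions. Only the 100 ms initial timeout comes from the description above.

```python
import random


class ElectionBackoff:
    """Hypothetical election delay policy that keeps backing off until a commit."""

    def __init__(self, initial_ms=100, backoff_ms=100, max_ms=10_000, seed=None):
        self.initial_ms = initial_ms
        self.backoff_ms = backoff_ms
        self.max_ms = max_ms
        self.attempts = 0
        self._rng = random.Random(seed)

    def upper_bound_ms(self):
        """Largest delay allowed for the next election attempt."""
        return min(self.max_ms, self.initial_ms + self.backoff_ms * self.attempts)

    def next_delay_ms(self):
        """Pick a random delay for the next attempt and count the attempt."""
        delay = self._rng.randint(1, self.upper_bound_ms())
        self.attempts += 1
        return delay

    def on_publication_failed(self):
        # Deliberately keep self.attempts, so the next election backs off
        # further instead of starting again from the initial timeout.
        pass

    def on_publication_committed(self):
        # Only a committed cluster state resets the backoff.
        self.attempts = 0


policy = ElectionBackoff(seed=0)
for _ in range(3):
    print(policy.upper_bound_ms(), policy.next_delay_ms())
    policy.on_publication_failed()
```

With these assumed defaults, the upper bound grows from 100 ms to 200 ms to 300 ms across consecutive failed publications, which gives an elected node progressively more time to finish publishing before the next election starts.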

Additional Details

Host/Environment (please complete the following information):
OS 2.11
