This repository has been archived by the owner on Dec 17, 2018. It is now read-only.

Duplicate heartbeats may prevent a Raft follower from timing out a Raft leader #16

Open
allengeorge opened this issue Nov 25, 2013 · 0 comments

@allengeorge
Owner

If you have a really poor network that duplicates a lot of packets (and specifically, heartbeat packets), a follower can keep believing that it's still in communication with a leader that has actually failed. This is because a heartbeat packet does not contain any information that would move time forward: the follower cannot tell a fresh heartbeat from a duplicate of an old one, so every copy resets its timeout. This could mean that a leader failure goes undetected, which could prevent the Raft cluster from making progress.
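
To make the failure mode concrete, here is a minimal sketch in Python. It is not the library's code: the `Follower` class, the tick-based clock, and the `first_timeout_tick` helper are all hypothetical and exist only for illustration. It assumes the follower resets its timeout on every heartbeat it receives and that the network keeps delivering duplicate heartbeats after the leader has died.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Follower:
    """Hypothetical follower that tracks only when it last saw a heartbeat."""
    election_timeout: int        # in logical ticks (assumed unit)
    last_heartbeat_at: int = 0

    def on_heartbeat(self, now: int) -> None:
        # A bare heartbeat carries nothing that marks it as fresh or stale,
        # so every copy, duplicate or not, resets the timeout.
        self.last_heartbeat_at = now

    def leader_timed_out(self, now: int) -> bool:
        return now - self.last_heartbeat_at >= self.election_timeout


def first_timeout_tick(duplicates_until: int,
                       leader_dies_at: int = 10,
                       last_tick: int = 100,
                       election_timeout: int = 5) -> Optional[int]:
    """Return the tick at which the follower times out the leader, or None."""
    follower = Follower(election_timeout)
    for now in range(1, last_tick + 1):
        leader_alive = now < leader_dies_at
        network_replaying = now <= duplicates_until  # stale copies still arriving
        if leader_alive or network_replaying:
            follower.on_heartbeat(now)
        elif follower.leader_timed_out(now):
            return now
    return None


if __name__ == "__main__":
    print(first_timeout_tick(duplicates_until=0))    # 14: failure detected
    print(first_timeout_tick(duplicates_until=100))  # None: never detected
```

With no duplicates the follower notices the dead leader one timeout after the last real heartbeat. As long as duplicates keep arriving, it never does.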

This is highly unlikely in practice. It's much more likely that the network will drop packets than duplicate them. Moreover, a few duplicates don't matter: the problem only arises if the duplicates keep arriving, which is unlikely.

That said, this should be mitigated. One solution would be to use periodic NOOPs instead of heartbeats to verify that the leader is still alive.
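
One possible reading of the NOOP idea, sketched in Python: a NOOP is a real log entry, so each one carries a new log index, and a follower can ignore any message that does not advance the highest index it has seen. This is an assumption about how the mitigation might work, not a description of an actual implementation. `NoopFollower`, `on_noop`, and `first_timeout_tick_with_noops` are hypothetical names, and time is modeled as logical ticks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NoopFollower:
    """Hypothetical follower that only trusts messages that advance the log."""
    election_timeout: int        # in logical ticks (assumed unit)
    last_progress_at: int = 0
    highest_index_seen: int = 0

    def on_noop(self, index: int, now: int) -> bool:
        # A duplicate or stale NOOP repeats an index we already have,
        # so it is ignored and does not reset the timeout.
        if index <= self.highest_index_seen:
            return False
        self.highest_index_seen = index
        self.last_progress_at = now
        return True

    def leader_timed_out(self, now: int) -> bool:
        return now - self.last_progress_at >= self.election_timeout


def first_timeout_tick_with_noops(leader_dies_at: int = 10,
                                  last_tick: int = 100,
                                  election_timeout: int = 5) -> Optional[int]:
    """Leader sends one NOOP per tick until it dies; afterwards the network
    replays the last NOOP on every tick. Return the tick of the timeout."""
    follower = NoopFollower(election_timeout)
    last_sent_index = 0
    for now in range(1, last_tick + 1):
        if now < leader_dies_at:
            last_sent_index += 1     # a fresh NOOP gets a new log index
        # After the leader dies, the same index is delivered again (a duplicate).
        follower.on_noop(last_sent_index, now)
        if follower.leader_timed_out(now):
            return now
    return None


if __name__ == "__main__":
    print(first_timeout_tick_with_noops())  # 14: detected despite duplicates
```

Because the replayed NOOP never advances the index, the follower's timeout keeps running and it detects the failure even though duplicates continue indefinitely.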

@ghost ghost assigned allengeorge Nov 25, 2013