[Segment Replication] Adding PrimaryMode check before publishing checkpoint and processing a received checkpoint. #4157
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1434,6 +1434,14 @@ public final boolean shouldProcessCheckpoint(ReplicationCheckpoint requestCheckp | |
logger.trace(() -> new ParameterizedMessage("Ignoring new replication checkpoint - shard is not started {}", state())); | ||
return false; | ||
} | ||
if (getReplicationTracker().isPrimaryMode()) { | ||
logger.trace( | ||
() -> new ParameterizedMessage( | ||
"Ignoring new replication checkpoint - shard is in primaryMode and cannot receive any checkpoints." | ||
) | ||
); | ||
Review comment: I am concerned about the silent logging. Ideally this should be rare; if we see it often, it might reflect a bug. Let's log at WARN level. Reply: Sure, I will change this to WARN level. |
||
return false; | ||
} | ||
ReplicationCheckpoint localCheckpoint = getLatestReplicationCheckpoint(); | ||
if (localCheckpoint.isAheadOf(requestCheckpoint)) { | ||
logger.trace( | ||
|
Should we also reject checkpoints if the replica copy (where the checkpoint was sent) is operating in primary mode?
+1
Also on that note - we will want to cancel any ongoing replication events on both sides. I've added #4136 to cover this.
Is it also worth checking shardRouting here?
Added logic to reject checkpoints if the shard is in PrimaryMode in the latest commit.
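
For readers following the thread, the order of checks discussed above can be sketched in isolation. This is a simplified illustration, not the code from the PR: `CheckpointGate`, `ShardState`, and the `long` version numbers are hypothetical stand-ins for `IndexShard`, its state, and `ReplicationCheckpoint`, and `java.util.logging` replaces the project's logger. The diff does not show what follows the `isAheadOf` branch, so returning `true` at the end is an assumption.

```java
import java.util.logging.Logger;

/** Hypothetical, self-contained sketch of the checkpoint guard discussed in this review. */
public class CheckpointGate {
    private static final Logger LOGGER = Logger.getLogger(CheckpointGate.class.getName());

    /** Stand-in for the shard lifecycle state (assumption, not the real enum). */
    public enum ShardState { RECOVERING, STARTED, CLOSED }

    /**
     * Returns true only if an incoming checkpoint should be processed.
     * Versions are plain longs here; the real code compares ReplicationCheckpoint objects.
     */
    public static boolean shouldProcess(ShardState state, boolean primaryMode,
                                        long localVersion, long incomingVersion) {
        if (state != ShardState.STARTED) {
            LOGGER.fine("Ignoring new replication checkpoint - shard is not started " + state);
            return false;
        }
        if (primaryMode) {
            // Logged at warning level, as requested in the review: this should be rare.
            LOGGER.warning("Ignoring new replication checkpoint - shard is in primaryMode "
                + "and cannot receive any checkpoints.");
            return false;
        }
        if (localVersion > incomingVersion) {
            // Mirrors localCheckpoint.isAheadOf(requestCheckpoint) in the diff.
            LOGGER.fine("Ignoring new replication checkpoint - local copy is ahead");
            return false;
        }
        // Assumption: any checks after this point in the real method are omitted.
        return true;
    }

    public static void main(String[] args) {
        System.out.println(shouldProcess(ShardState.STARTED, false, 3, 5));
        System.out.println(shouldProcess(ShardState.STARTED, true, 3, 5));
    }
}
```

Placing the primary-mode check before the version comparison means a shard in primary mode rejects every incoming checkpoint regardless of how the versions compare, which is the behavior the reviewers asked for.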