[Segment Replication] Differentiate post merge checkpoints #10872
Labels
enhancement
Enhancement or improvement to existing feature or request
Indexing:Replication
Issues and PRs related to core replication framework eg segrep
Is your feature request related to a problem? Please describe.
Today, SegRep uses the ReplicationCheckpoint to compute replica staleness, report stats, and enforce backpressure on lagging replicas. It would be useful to differentiate, in metrics, the lag caused by checkpoints that are strictly post-merge refreshes, where the searchable doc count does not change, and to exclude that lag from backpressure computations.
Describe the solution you'd like
This data should still be surfaced, but it should be excluded from backpressure computations.
/_cat/segment_replication should still show ongoing lag but have a separate column to identify the lag for syncing to a merged checkpoint.
This should also be surfaced through the _nodes/stats API as a separate metric for both byte lag and time lag.
Perhaps we can restructure the cat SR API to show one row for each checkpoint the replica is lagging behind, plus an isMerge column.
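To make the proposal concrete, here is a minimal sketch of the accounting it implies. It is not OpenSearch code: `Checkpoint`, `Lag`, `split_lag`, and `exceeds_backpressure_limit` are hypothetical names, the `is_merge` flag is assumed to be set by the primary when a refresh only publishes merged segments, and summing per-checkpoint sizes is a simplification of how bytes behind would really be computed.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical stand-in for a pending checkpoint (not ReplicationCheckpoint)."""
    version: int
    size_bytes: int   # assumed: bytes the replica must copy to reach this checkpoint
    age_millis: int   # assumed: time since the primary published this checkpoint
    is_merge: bool    # assumed flag: post-merge refresh, searchable doc count unchanged


@dataclass(frozen=True)
class Lag:
    bytes_behind: int
    millis_behind: int


def _lag(checkpoints: Iterable[Checkpoint]) -> Lag:
    cps = list(checkpoints)
    # Time lag is taken from the oldest pending checkpoint; bytes are summed.
    return Lag(
        bytes_behind=sum(c.size_bytes for c in cps),
        millis_behind=max((c.age_millis for c in cps), default=0),
    )


def split_lag(pending: Iterable[Checkpoint]) -> Tuple[Lag, Lag]:
    """Return (indexing_lag, merge_lag) so both can be reported separately."""
    cps = list(pending)
    indexing = _lag(c for c in cps if not c.is_merge)
    merge = _lag(c for c in cps if c.is_merge)
    return indexing, merge


def exceeds_backpressure_limit(pending: Iterable[Checkpoint], max_lag_millis: int) -> bool:
    """Only non-merge checkpoints count toward the backpressure decision."""
    indexing, _merge = split_lag(pending)
    return indexing.millis_behind > max_lag_millis


if __name__ == "__main__":
    pending = [
        Checkpoint(version=5, size_bytes=100, age_millis=9_000, is_merge=False),
        Checkpoint(version=6, size_bytes=5_000, age_millis=4_000, is_merge=True),
        Checkpoint(version=7, size_bytes=50, age_millis=1_000, is_merge=False),
    ]
    # One row per pending checkpoint, as in the proposed cat output.
    for c in pending:
        print(c.version, c.is_merge, c.size_bytes, c.age_millis)
    print(split_lag(pending))
```

In this sketch the merge lag stays visible through the second value returned by `split_lag`, so stats and cat output can still report it, while `exceeds_backpressure_limit` ignores it entirely.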
Describe alternatives you've considered
Leaving as is / silently excluding these checkpoints.