Segment replication cancellation of in-sync allocation ids in ReplicationTracker #7213

Closed
Tracked by #6761
dreamer-89 opened this issue Apr 18, 2023 · 1 comment
Labels
bug (Something isn't working), distributed framework

Comments

@dreamer-89
Member

dreamer-89 commented Apr 18, 2023

Coming out of the #6761 exercise: RecoveryWhileUnderLoadIT.testRecoverWhileRelocating fails an assertion because replication files are missing from the target store. This happens when segment replication is cancelled and the replica shard then fails, at which point the engine's close action clears the temporary replication files.

The existing cancellation logic runs on routing table changes and cancels segment replication for any allocation id that is not part of the cluster metadata's in-sync ids. This is problematic when a shard allocation has just been marked in-sync but that change has not yet been applied to the cluster state.
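
To make the race concrete, here is a minimal sketch of the check described above. It is not the actual ReplicationTracker code: the class `CancellationSketch`, the method `idsToCancel`, and the allocation id strings are all hypothetical names chosen for illustration. It only assumes that cancellation is computed as "ongoing replications minus the in-sync ids in the applied cluster state".

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not OpenSearch's ReplicationTracker API.
final class CancellationSketch {

    // Assumed rule: on a routing table change, cancel replication for every
    // allocation id that is absent from the applied cluster state's in-sync ids.
    static Set<String> idsToCancel(Set<String> ongoingReplications,
                                   Set<String> inSyncIdsInClusterState) {
        Set<String> toCancel = new TreeSet<>(ongoingReplications);
        toCancel.removeAll(inSyncIdsInClusterState);
        return toCancel;
    }

    public static void main(String[] args) {
        // Replications currently running on the primary (made-up ids).
        Set<String> ongoing = Set.of("replica-1", "relocating-target");

        // "relocating-target" was just marked in-sync locally, but the cluster
        // state carrying that update has not been applied yet.
        Set<String> appliedInSyncIds = Set.of("primary", "replica-1");

        // The freshly in-sync copy is treated as stale and gets cancelled.
        System.out.println(idsToCancel(ongoing, appliedInSyncIds)); // prints [relocating-target]
    }
}
```

In this sketch the relocating target is cancelled even though it is a valid copy, which matches the sequence reported above: cancellation, replica shard failure, and then the temporary replication files being cleared on engine close.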

@DarshitChanpura
Member

@dreamer-89 Closing this issue, as it was fixed via #7214 and backported to the 2.x line via #7220 and #7221. Feel free to re-open it.

3 participants