[RFC] Existing Cluster Migration to/from Remote Store - HLD #12246
Comments
Requesting feedback from @shwetathareja, @andrross, @mch2, @ankitkala, @itiyamas.
Thanks for getting this going @gbbafna. A few thoughts...
All of the shards on the restarted node would then move to remote-backed? Wouldn't the replicas need to ensure their corresponding primary has moved first? Or are all primaries updated without restarts?
Thanks @mch2 for reviewing.
The user has to do this via a cluster settings update call. This has to be done after at least one node restart. Otherwise the direction set to
There is a hard requirement for a restart, as we need to update the yml settings for the node.
Aim
To support migration of an existing DocRep cluster to/from a remote-backed cluster, which has remote-backed segments and translog enabled.
RFC : #7986
Tenets
Background & Terminology
Remote Store - https://opensearch.org/docs/latest/tuning-your-cluster/availability-and-recovery/remote-store/index/
Segment Replication - https://opensearch.org/docs/latest/tuning-your-cluster/availability-and-recovery/segment-replication/index/
What’s not supported/covered?
Migration involving Segment Replication enabled indices without remote store is not currently in scope and will be considered as a follow-up.
Pre Requisites
Approaches
#7986
Rolling restarts of all the nodes
The migration process starts with an update of the cluster settings to mixed mode, with the eventual direction set to remote. This is followed by rolling restarts of all the nodes.
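As a rough illustration, the settings update that kicks off the migration could be built as below. The setting keys and values used here (`cluster.remote_store.compatibility_mode`, `cluster.migration.direction`, `mixed`, `remote_store`, `docrep`) are assumptions made for this sketch; this RFC does not fix the final names. The resulting body would be sent through the cluster settings update API.

```python
import json

# Hypothetical setting keys and values; this RFC does not fix the final names.
COMPATIBILITY_MODE_KEY = "cluster.remote_store.compatibility_mode"
MIGRATION_DIRECTION_KEY = "cluster.migration.direction"


def migration_settings_body(direction: str) -> str:
    """Build a JSON body for a cluster settings update that enables mixed mode."""
    if direction not in ("remote_store", "docrep"):
        raise ValueError(f"unknown migration direction: {direction}")
    body = {
        "persistent": {
            # Allow docrep and remote-backed nodes to coexist during migration.
            COMPATIBILITY_MODE_KEY: "mixed",
            # Eventual direction of the migration.
            MIGRATION_DIRECTION_KEY: direction,
        }
    }
    return json.dumps(body, indent=2)


print(migration_settings_body("remote_store"))
```

Keeping the direction as an explicit value makes it easy to reject anything other than the two supported directions before the request is sent.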
DocRep to Remote Backed
Primary shard copies will start on or relocate to remote-backed nodes first. A replica copy of a shard will move to remote only after the primary copy has moved; the cluster manager will enforce this. The primary will then operate in bilingual mode: indexing documents into docrep replicas, uploading segments/translog to remote storage, and publishing checkpoints to remote replicas. The primary shard will use node attributes to find the state of each shard copy. Whenever the primary shard starts, it will need to turn on bilingual mode.
As we restart each node, the new primaries, followed by their replica copies, will start operating in remote-backed mode. The primary will keep indexing documents into the older docrep replicas as well. When all the shards of an index have migrated to remote-backed mode, we will update the index metadata to mark the index as a remote-backed index.
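The condition for marking an index as remote-backed can be sketched as a simple check over its shard copies. The `ShardCopy` type and the `on_remote_node` flag are hypothetical names for this illustration; in practice that flag would be derived from the hosting node's attributes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ShardCopy:
    shard_id: int
    primary: bool
    on_remote_node: bool  # hypothetical flag, derived from node attributes


def index_fully_migrated(copies: List[ShardCopy]) -> bool:
    """True when every copy of every shard of the index sits on a remote-backed node."""
    # An index with no known copies is treated as not migrated.
    return bool(copies) and all(c.on_remote_node for c in copies)


copies = [
    ShardCopy(shard_id=0, primary=True, on_remote_node=True),
    ShardCopy(shard_id=0, primary=False, on_remote_node=False),
]
print(index_fully_migrated(copies))  # one replica is still on a docrep node
```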
We will need to add the direction of the migration as a cluster setting. It will be used to determine the allocation of new indices. This ensures the migration always moves forward, since new indices will not add to the migration work once it has started.
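A minimal sketch of how the direction could restrict where new indices are allocated is shown below. The `Node` type, the `remote_backed` flag and the direction strings are assumptions for this example, not names taken from the design.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    name: str
    remote_backed: bool  # hypothetical flag, derived from node attributes


def nodes_for_new_index(nodes: List[Node], direction: str) -> List[Node]:
    """Return the nodes on which shards of a newly created index may be allocated."""
    if direction == "remote":
        # New indices go straight to remote-backed nodes, so they never need migrating.
        return [n for n in nodes if n.remote_backed]
    if direction == "docrep":
        return [n for n in nodes if not n.remote_backed]
    raise ValueError(f"unknown migration direction: {direction}")


cluster = [Node("node-a", True), Node("node-b", False), Node("node-c", True)]
print([n.name for n in nodes_for_new_index(cluster, "remote")])
```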
Using node attributes also helps in modelling the remote store migration like a version upgrade process, wherein once we migrate the shards to a remote-backed node, we do not move them back to a docrep node.
Remote Backed to DocRep
Replica shard copies will start relocating to docrep nodes. The primary copy of a shard will move to docrep only after all of its replica copies have moved; the cluster manager will enforce this. In the meantime, the primary shard on the remote node operates in bilingual mode: indexing documents into docrep replicas, uploading segments/translog to remote storage, and publishing checkpoints to remote replicas. The primary shard will use node attributes to find the state of each shard copy. Whenever the primary shard starts, it will need to turn on bilingual mode.
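The ordering rules for both directions can be sketched as a single decision function that the cluster manager might consult. This is an illustration under assumptions: `Direction`, `Copy` and `can_move` are hypothetical names, and a real allocation decider would look at much more state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Direction(Enum):
    REMOTE = "remote"
    DOCREP = "docrep"


@dataclass(frozen=True)
class Copy:
    primary: bool
    on_remote_node: bool  # hypothetical flag, derived from node attributes


def can_move(copy: Copy, siblings: List[Copy], direction: Direction) -> bool:
    """Decide whether `copy` may move to the target side of the migration.

    `siblings` are the other copies of the same shard.
    """
    if direction is Direction.REMOTE:
        if copy.primary:
            return True  # primaries lead when moving to remote
        # A replica may follow only once its primary is on a remote node.
        return any(s.primary and s.on_remote_node for s in siblings)
    # Direction.DOCREP: replicas lead, the primary moves last.
    if not copy.primary:
        return True
    return all(not s.on_remote_node for s in siblings if not s.primary)


replica = Copy(primary=False, on_remote_node=False)
docrep_primary = Copy(primary=True, on_remote_node=False)
print(can_move(replica, [docrep_primary], Direction.REMOTE))  # primary has not moved yet
```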
User Story for docrep to remote migration
User Story for remote to docrep migration
Lifecycle of primary and replica shards for DocRep to Remote
After we set the direction to remote, new shards will only come up in remote mode.
Primary shards will get started/relocated on the remote nodes. We will start the primary shard on a remote node once the remote store is in sync. The replica shards will start on remote nodes only after the primary shard has started on a remote node.
After the primary shard is migrated to remote, new replica copies will start on or relocate to remote nodes as well. A new copy will take some time to come up, as it has to download all the data afresh. Read throughput can decrease during migration, as all the replicas will need some time to come up.
Dual Mode: Entered once the primary shard has uploaded to the remote store. From then on, it persists data on the remote store. But since it can still have older replicas on docrep nodes, it will need to send documents to those shards. This primary shard can move back to docrep mode only when it fails over and no remote shard copy is available, only a docrep replica shard.
Replica on node restart: If the primary is remote and the node is remote, the shard can hydrate from remote as part of RemoteIndexRecovery.
When all shards of an index have moved to remote-backed, we can set the index settings to be remote enabled. We can also reload the engines of all primaries to switch off dual mode. This needs deeper thought, though.
When all nodes have been restarted, we need to switch off the migration mode.
Lifecycle of primary and replica shards for Remote to DocRep
The main difference is that the primary shard will be the last to move to docrep nodes. Unlike migration to remote store, migration back to docrep nodes will have no downtime.
Components
Cluster Manager
Shard Allocation Deciders
Primary Promotion
Indexing
In mixed mode, when the primary is on a remote-backed node, replicas can be on both remote and docrep nodes. In that case the primary needs to supply documents to docrep nodes and publish checkpoints to remote nodes. We call this Dual Replication mode, where a remote-backed primary shard is able to take care of both docrep replicas and remote replicas. When all the copies of a given shard have moved to remote-backed/docrep, Dual Replication mode will be turned off.
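The fan-out described above can be sketched as follows. This is a simplified model, not the actual replication code: `Replica`, `dual_mode_needed` and `replicate` are hypothetical names, documents are plain dictionaries, and a checkpoint is reduced to an integer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    name: str
    on_remote_node: bool  # hypothetical flag, derived from node attributes
    received_docs: List[dict] = field(default_factory=list)
    last_checkpoint: int = -1


def dual_mode_needed(primary_on_remote: bool, replicas: List[Replica]) -> bool:
    """Dual Replication is needed while a remote primary still has docrep replicas."""
    return primary_on_remote and any(not r.on_remote_node for r in replicas)


def replicate(doc: dict, checkpoint: int, replicas: List[Replica]) -> None:
    """Fan out one indexing operation from a remote-backed primary."""
    for r in replicas:
        if r.on_remote_node:
            # Remote replicas only learn about the new checkpoint and
            # fetch the data from the remote store themselves.
            r.last_checkpoint = checkpoint
        else:
            # Docrep replicas must receive the document itself.
            r.received_docs.append(doc)


replicas = [Replica("r-docrep", False), Replica("r-remote", True)]
replicate({"id": 1}, checkpoint=7, replicas=replicas)
print(dual_mode_needed(True, replicas))
```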
We need to evaluate, for each action, the impact on the remote primary, on the receivers (docrep replica and remote replica), and on the remote primary receiving success/failure from a receiver, for the duration of the migration. This will be covered in depth in the Dual Replication LLD.
Reference : #5033
Primary - Primary Relocation
Doc Rep → Remote
The relocation to remote will not complete until all the data has been uploaded to remote and is in sync. The remote upload shouldn’t block writes.
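One way to picture this constraint is a tracker that lets writes proceed freely while gating the handoff on upload progress. Comparing sequence numbers is an assumption of this sketch, and `RemoteRelocation` is a hypothetical name; the design does not specify how "in sync" is measured.

```python
class RemoteRelocation:
    """Tracks a docrep to remote primary relocation (illustrative only)."""

    def __init__(self) -> None:
        self.max_seq_no = -1       # highest operation accepted by the primary
        self.uploaded_seq_no = -1  # highest operation known to be in the remote store

    def index(self) -> int:
        """Accept a write; this never waits on the remote upload."""
        self.max_seq_no += 1
        return self.max_seq_no

    def upload_progress(self, seq_no: int) -> None:
        """Record that everything up to `seq_no` is now in the remote store."""
        capped = min(seq_no, self.max_seq_no)
        self.uploaded_seq_no = max(self.uploaded_seq_no, capped)

    def can_complete(self) -> bool:
        """The relocation may finish only when the remote store has caught up."""
        return self.uploaded_seq_no >= self.max_seq_no


relocation = RemoteRelocation()
relocation.index()
relocation.index()
relocation.upload_progress(0)
print(relocation.can_complete())  # one operation is still not uploaded
```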
Doc Rep → Doc Rep
Status Quo
Remote → Remote
We should relocate a remote primary shard to another remote node just like a remote-backed shard. We need to make sure that the docrep replicas continue to receive all documents from both of the replicas and don’t have any holes.
Remote → DocRep
We should relocate a remote primary shard to a docrep node just like a docrep-backed shard.
Failover
For a primary shard backed by remote store with at least one replica on a remote node, the failover will be like that of a remote shard.
For a primary shard backed by remote store with all replicas on docrep, the shard will move back to a docrep node.
If the shard is still migrating to remote store and the primary fails, the migration process will restart on a new remote node.
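The three failover cases above can be summarised in a small decision function. The enum and function names are hypothetical, and the case of a remote primary with no replicas at all is not covered by the text, so this sketch rejects it rather than guessing.

```python
from enum import Enum
from typing import List


class FailoverAction(Enum):
    PROMOTE_REMOTE_REPLICA = "promote a replica on a remote node"
    PROMOTE_DOCREP_REPLICA = "promote a docrep replica; shard moves back to docrep"
    RESTART_MIGRATION = "restart the migration on a new remote node"


def on_primary_failure(primary_on_remote: bool, replicas_on_remote: List[bool]) -> FailoverAction:
    """Pick the failover path for a shard whose primary has just failed.

    `replicas_on_remote` holds one flag per replica copy of the shard.
    """
    if not primary_on_remote:
        # The primary had not finished moving to remote store.
        return FailoverAction.RESTART_MIGRATION
    if not replicas_on_remote:
        raise ValueError("a remote primary without replicas is outside this sketch")
    if any(replicas_on_remote):
        return FailoverAction.PROMOTE_REMOTE_REPLICA
    return FailoverAction.PROMOTE_DOCREP_REPLICA


print(on_primary_failure(True, [False, True]).value)
```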
Supporting References
#7986
Issues
[] #12245