Add documentation for remote store migration #7121

Merged
merged 11 commits into from
May 13, 2024
@@ -0,0 +1,235 @@
---
layout: default
title: Migrating to remote-backed storage
nav_order: 5
parent: Remote-backed storage
grand_parent: Availability and recovery
---

# Migrating to remote-backed storage

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/OpenSearch/issues/7986).
{: .warning}

Introduced 2.14
{: .label .label-purple }

Remote-backed storage offers OpenSearch users a new way to protect against data loss by automatically creating backups of all index transactions and sending them to remote storage. To use this feature, segment replication must also be enabled. See [Segment replication]({{site.url}}{{site.baseurl}}/opensearch/segment-replication/) for additional information.

You can migrate a document-replication-based cluster to remote-backed storage using the rolling upgrade mechanism.

Rolling upgrades, sometimes referred to as "node replacement upgrades", can be performed on running clusters with virtually no downtime. Nodes are individually stopped and upgraded in place. Alternatively, nodes can be stopped and replaced, one at a time, by hosts running the new version. During this process, you can continue to index and query data in your cluster.

## Preparing to migrate

Review [Upgrading OpenSearch]({{site.url}}{{site.baseurl}}/upgrade-opensearch/index/) for recommendations about backing up your configuration files and creating a snapshot of the cluster state and indexes before you make any changes to your OpenSearch cluster.


As a prerequisite for this migration, upgrade your cluster to OpenSearch version 2.14.

**Important:** OpenSearch nodes cannot be migrated back to document replication as of 2.14. If you need to revert the migration, then you will need to perform a fresh installation of OpenSearch and restore the cluster from a snapshot. Take a snapshot and store it in a remote repository before beginning the upgrade procedure.
{: .important}
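
Because a snapshot is the only way back, it can help to script the snapshot request so that it is taken the same way before every migration attempt. The following Python sketch builds a request for the create snapshot API (`PUT /_snapshot/<repository>/<snapshot>`). The function names, the repository name `my-backup-repo`, the snapshot name, and the `http://localhost:9200` address used in the usage note are assumptions for illustration only. The sketch also assumes that the snapshot repository is already registered, and it omits TLS and authentication.

```python
import json
import urllib.parse
import urllib.request


def build_snapshot_request(base_url, repository, snapshot):
    """Build a request that snapshots all indexes and the cluster state."""
    path = "/_snapshot/{}/{}".format(
        urllib.parse.quote(repository, safe=""),
        urllib.parse.quote(snapshot, safe=""),
    )
    # wait_for_completion makes the call return only after the snapshot finishes.
    url = base_url.rstrip("/") + path + "?wait_for_completion=true"
    body = {"include_global_state": True}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def snapshot_succeeded(response):
    """Return True if a create snapshot response reports a complete snapshot."""
    return response.get("snapshot", {}).get("state") == "SUCCESS"
```

To send the request, pass the result of `build_snapshot_request("http://localhost:9200", "my-backup-repo", "pre-migration-1")` to `urllib.request.urlopen`, parse the response body as JSON, and continue only if `snapshot_succeeded` returns `True`.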

## Performing the upgrade

1. Verify the health of your OpenSearch cluster before you begin. You should resolve any index or shard allocation issues prior to upgrading to ensure that your data is preserved. A status of **green** indicates that all primary and replica shards are allocated. See [Cluster health]({{site.url}}{{site.baseurl}}/api-reference/cluster-api/cluster-health/) for more information. The following command queries the `_cluster/health` API endpoint:
```json
GET "/_cluster/health?pretty"
```
The response should look similar to the following example:
```json
{
"cluster_name":"opensearch-dev-cluster",
"status":"green",
"timed_out":false,
"number_of_nodes":4,
"number_of_data_nodes":4,
"active_primary_shards":1,
"active_shards":4,
"relocating_shards":0,
"initializing_shards":0,
"unassigned_shards":0,
"delayed_unassigned_shards":0,
"number_of_pending_tasks":0,
"number_of_in_flight_fetch":0,
"task_max_waiting_in_queue_millis":0,
"active_shards_percent_as_number":100.0
}
```
1. Disable shard replication to prevent shard replicas from being created while nodes are being taken offline. This stops the movement of Lucene index segments on nodes in your cluster. You can disable shard replication by querying the `_cluster/settings` API endpoint:
```json
PUT "/_cluster/settings?pretty"
{
"persistent": {
"cluster.routing.allocation.enable": "primaries"
}
}
```
The response should look similar to the following example:
```json
{
"acknowledged" : true,
"persistent" : {
"cluster" : {
"routing" : {
"allocation" : {
"enable" : "primaries"
}
}
}
},
"transient" : { }
}
```

1. Perform a flush operation on the cluster to commit transaction log entries to the Lucene index:
```json
POST "/_flush?pretty"
```
The response should look similar to the following example:
```json
{
"_shards" : {
"total" : 4,
"successful" : 4,
"failed" : 0
}
}
```
1. Set `remote_store.compatibility_mode` to `mixed` to allow remote-backed storage nodes to join the cluster. Set `migration.direction` to `remote_store` so that new indexes are allocated only to remote-backed data nodes:

```json
PUT "/_cluster/settings?pretty"
{
"persistent": {
"remote_store.compatibility_mode": "mixed",
"migration.direction" : "remote_store"
}
}
```
The response should look similar to the following example:
```json
{
"acknowledged" : true,
"persistent" : {
"remote_store" : {
"compatibility_mode" : "mixed"
},
"migration" : {
"direction" : "remote_store"
}
},
"transient" : { }
}
```
1. Review your cluster and identify the first node to migrate.
1. Provide the remote store repository details as node attributes in `opensearch.yml`, as shown in the following example:

```yml
# Repository name
node.attr.remote_store.segment.repository: my-repo-1
node.attr.remote_store.translog.repository: my-repo-2
node.attr.remote_store.state.repository: my-remote-state-repo

# Segment repository settings
node.attr.remote_store.repository.my-repo-1.type: s3
node.attr.remote_store.repository.my-repo-1.settings.bucket: <Bucket Name 1>
node.attr.remote_store.repository.my-repo-1.settings.base_path: <Bucket Base Path 1>
node.attr.remote_store.repository.my-repo-1.settings.region: us-east-1

# Translog repository settings
node.attr.remote_store.repository.my-repo-2.type: s3
node.attr.remote_store.repository.my-repo-2.settings.bucket: <Bucket Name 2>
node.attr.remote_store.repository.my-repo-2.settings.base_path: <Bucket Base Path 2>
node.attr.remote_store.repository.my-repo-2.settings.region: us-east-1

# Enable Remote cluster state cluster setting
cluster.remote_store.state.enabled: true

# Remote cluster state repository settings
node.attr.remote_store.repository.my-remote-state-repo.type: s3
node.attr.remote_store.repository.my-remote-state-repo.settings.bucket: <Bucket Name 3>
node.attr.remote_store.repository.my-remote-state-repo.settings.base_path: <Bucket Base Path 3>
node.attr.remote_store.repository.my-remote-state-repo.settings.region: <Bucket region>

```
1. Stop the node you are migrating. Do not delete the volume associated with the container when you delete the container. The new OpenSearch container will use the existing volume. **Deleting the volume will result in data loss**.
1. Deploy a new container running the same version of OpenSearch and mapped to the same volume as the container you deleted.
1. Query the `_cat/nodes` endpoint after OpenSearch is running on the new node to confirm that it has joined the cluster. Wait for the cluster to become green again.
1. Repeat steps 5 through 9 for each node in your cluster.
1. Reenable shard replication:
```json
PUT "/_cluster/settings?pretty"
{
"persistent": {
"cluster.routing.allocation.enable": "all"
}
}
```
The response should look similar to the following example:
```json
{
"acknowledged" : true,
"persistent" : {
"cluster" : {
"routing" : {
"allocation" : {
"enable" : "all"
}
}
}
},
"transient" : { }
}
```
1. Confirm that the cluster is healthy:
```json
GET "/_cluster/health?pretty"
```
The response should look similar to the following example:
```json
{
"cluster_name" : "opensearch-dev-cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 4,
"number_of_data_nodes" : 4,
"discovered_master" : true,
"active_primary_shards" : 1,
"active_shards" : 4,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
```
1. Clear the `remote_store.compatibility_mode` and `migration.direction` settings so that non-remote nodes can no longer join the cluster:
```json
PUT "/_cluster/settings?pretty"
{
"persistent": {
"remote_store.compatibility_mode": null,
"migration.direction" : null
}
}
```

Review comment on the `remote_store.compatibility_mode` line of this request: I tried running this, but it is currently failing (see opensearch-project/OpenSearch#13634). Maybe for 2.14 we can add a step to explicitly set this to `strict`? In 2.15, we can update the docs once we have the bug fix in.

The response should look similar to the following example:
```json
{
"acknowledged" : true,
"persistent" : { },
"transient" : { }
}
```
1. The migration to remote store is now complete, and you can begin enjoying the durability and performance benefits.
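
The cluster-level calls in the preceding procedure can be scripted. The following Python sketch uses only the standard library. It lists the request bodies from the procedure in order and includes a helper that decides whether a `_cluster/health` response is safe to proceed on. The function names and the `http://localhost:9200` address are assumptions for illustration and are not part of OpenSearch. Stopping and replacing each node is not covered, and TLS and authentication are omitted.

```python
import json
import urllib.request

# Hypothetical address; adjust for your cluster (TLS and auth are omitted).
BASE_URL = "http://localhost:9200"


def settings_request(persistent):
    """Build a (method, path, body) tuple for a persistent settings update."""
    return ("PUT", "/_cluster/settings", {"persistent": persistent})


def pre_migration_requests():
    """Requests to issue before the first node is replaced, in order."""
    return [
        # Restrict allocation to primaries while nodes are taken offline.
        settings_request({"cluster.routing.allocation.enable": "primaries"}),
        # Commit transaction log entries to the Lucene index.
        ("POST", "/_flush", None),
        # Allow remote-backed nodes to join and direct new shards to them.
        settings_request({
            "remote_store.compatibility_mode": "mixed",
            "migration.direction": "remote_store",
        }),
    ]


def post_migration_requests():
    """Requests to issue after every node has been replaced, in order."""
    return [
        settings_request({"cluster.routing.allocation.enable": "all"}),
        # A null value clears a setting. The review discussion on this step
        # reports that this call fails in 2.14
        # (opensearch-project/OpenSearch#13634).
        settings_request({
            "remote_store.compatibility_mode": None,
            "migration.direction": None,
        }),
    ]


def is_safe_to_proceed(health):
    """Return True if a _cluster/health response shows full allocation."""
    return (
        health.get("status") == "green"
        and not health.get("timed_out", False)
        and health.get("unassigned_shards", 0) == 0
        and health.get("relocating_shards", 0) == 0
        and health.get("initializing_shards", 0) == 0
    )


def send(method, path, body=None):
    """Send one request to the cluster and return the parsed JSON response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A driver script might call `send("GET", "/_cluster/health")` and continue only when `is_safe_to_proceed` returns `True`, both before issuing `pre_migration_requests()` and after each node rejoins the cluster.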


## Related cluster settings

You can use the following cluster settings to enable migration.

| Field | Data type | Description |
| :--- | :--- | :--- |
| remote_store.compatibility_mode | String | Defaults to `strict`, which allows only non-remote nodes or only remote nodes to join, depending on the initial cluster type. When set to `mixed`, both remote and non-remote nodes can join the cluster. |
| migration.direction | String | Defaults to `none`. When set to `remote_store`, new shards are created only on remote-backed storage nodes. |

A direction cannot technically create anything. Either "remote-backed storage nodes" or "remote-store-backed nodes".
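
To check which values are currently in effect, you can read them back from the cluster settings API. The following Python sketch assumes a parsed response from `GET _cluster/settings?flat_settings=true&include_defaults=true` and uses the setting names exactly as they appear in this document. The function names are hypothetical. The lookup order reflects the fact that transient settings take precedence over persistent settings, which take precedence over defaults.

```python
def effective_setting(settings_response, key):
    """Return the effective value of a flat cluster setting, or None if absent."""
    # Transient values override persistent ones, which override defaults.
    for scope in ("transient", "persistent", "defaults"):
        value = settings_response.get(scope, {}).get(key)
        if value is not None:
            return value
    return None


def migration_in_progress(settings_response):
    """Return True while the cluster still accepts both node types."""
    mode = effective_setting(settings_response, "remote_store.compatibility_mode")
    return mode == "mixed"
```

A script could call `migration_in_progress` after the final step of the procedure to confirm that the cluster no longer accepts both node types.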

