Add remote state publication #7364

Merged (10 commits) on Jun 14, 2024
@@ -54,12 +54,44 @@

Setting | Default | Description
:--- | :--- | :---
`cluster.remote_store.state.index_metadata.upload_timeout` | 20s | The amount of time to wait for index metadata upload to complete. Note that index metadata for separate indexes is uploaded in parallel.
`cluster.remote_store.state.global_metadata.upload_timeout` | 20s | The amount of time to wait for global metadata upload to complete. Global metadata contains globally applicable metadata, such as templates, cluster settings, data stream metadata, and repository metadata.
`cluster.remote_store.state.metadata_manifest.upload_timeout` | 20s | The amount of time to wait for the manifest file upload to complete. The manifest file contains the details of each of the files uploaded for a single cluster state, both index metadata files and global metadata files.
`cluster.remote_store.state.index_metadata.upload_timeout` | 20s | Deprecated. Use `cluster.remote_store.state.global_metadata.upload_timeout` instead.
`cluster.remote_store.state.global_metadata.upload_timeout` | 20s | The amount of time to wait for cluster state upload to complete.
`cluster.remote_store.state.metadata_manifest.upload_timeout` | 20s | The amount of time to wait for the manifest file upload to complete. The manifest file contains the details of each of the files uploaded for a single cluster state, both index metadata files and global metadata files.
`cluster.remote_store.state.cleanup_interval` | 300s | The interval at which the asynchronous remote state cleanup task runs. This task deletes old remote state files.
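
As a minimal sketch, the settings in the preceding table could be written out with their documented defaults as follows. Placing them in `opensearch.yml` is an assumption made for illustration; the table itself does not say where the settings are applied, so verify this for your version.

```yml
# Hypothetical opensearch.yml fragment; values are the documented defaults
cluster.remote_store.state.global_metadata.upload_timeout: 20s
cluster.remote_store.state.metadata_manifest.upload_timeout: 20s
cluster.remote_store.state.cleanup_interval: 300s
```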

## Limitations

The remote cluster state functionality has the following limitations:
- Unsafe bootstrap scripts cannot be run when the remote cluster state is enabled. When a majority of cluster manager nodes are lost and the cluster goes down, you must replace any remaining cluster manager nodes and reseed the nodes in order to bootstrap a new cluster.

## Remote cluster state publication

> Review comment (Member):
>
> The cluster manager node processes updates to the cluster state. It then publishes the updated cluster state over the local transport layer to all follower nodes. With remote cluster state enabled, the cluster state is backed up to the remote store with every state update. The follower nodes can then fetch the state directly from the remote store, which reduces the publication overhead on the cluster manager node. This can be done by enabling the experimental remote publication feature.
>
> Enable the feature flag for the `remote_store.publication` feature by following the experimental feature flag documentation. This doesn't change the publication flow: follower nodes do not send an acknowledgment back to the cluster manager until they have downloaded the updated cluster state from the remote store, after which they proceed as in the current flow. Note that enabling remote cluster state is mandatory for remote publication to work.
>
> Also, the `RoutingTable`, which contains the shard allocation details for each index in the cluster state, requires setting up a remote blob store repository. It can be configured using the repository settings shown in this section.

The cluster state published to remote-backed storage can be used for publication. Currently, the active cluster manager
sends the cluster state object over the transport layer to the follower nodes. This flow can be changed to fetch the
cluster state from the remote store. This can be done by enabling the experimental remote publication feature.
Enable the `remote_store.publication` feature flag by following the [experimental feature flag documentation]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/experimental/).
When remote publication is enabled, the cluster manager node uploads the cluster state to the remote store and then sends the
remote path of the cluster state to the follower nodes. The follower nodes then download the cluster state from the remote store.

The routing table is an object within the cluster state that contains the shard allocation details for each index.
This object can become large when the cluster has a large number of shards. The routing table must be stored in the
remote store for remote publication to work. To enable remote persistence of the routing table, configure the repository
as follows:

```yml
# Remote routing table repository settings
node.attr.remote_store.routing_table.repository: my-remote-routing-table-repo
node.attr.remote_store.repository.my-remote-routing-table-repo.type: s3
node.attr.remote_store.repository.my-remote-routing-table-repo.settings.bucket: <Bucket Name 3>
node.attr.remote_store.repository.my-remote-routing-table-repo.settings.region: <Bucket region>
```
You do not have to use different remote store repositories for state and routing.
These stores can share the same repository.
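
As a rough illustration of sharing one repository, the following sketch points both the cluster state and the routing table at the same repository. The `node.attr.remote_store.state.repository` attribute is assumed from the remote cluster state setup, which is not part of the lines shown here, and the repository name `my-shared-remote-repo` is hypothetical.

```yml
# Assumption: cluster state and routing table both reference one shared repository
node.attr.remote_store.state.repository: my-shared-remote-repo
node.attr.remote_store.routing_table.repository: my-shared-remote-repo
node.attr.remote_store.repository.my-shared-remote-repo.type: s3
node.attr.remote_store.repository.my-shared-remote-repo.settings.bucket: <Bucket Name 3>
node.attr.remote_store.repository.my-shared-remote-repo.settings.region: <Bucket region>
```

With this layout, the repository type, bucket, and region are declared once and referenced by both attributes.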

The following table lists the relevant cluster settings for remote publication:

Setting | Default | Description
:--- | :--- | :---
`cluster.remote_store.state.read_timeout` | 20s | The amount of time to wait for remote state download to complete on the follower node.
`cluster.remote_store.routing_table.path_type` | HASHED_PREFIX | The path type used to create the index routing path in the blob store. Valid values are `FIXED`, `HASHED_PREFIX`, and `HASHED_INFIX`.
`cluster.remote_store.routing_table.path_hash_algo` | FNV_1A_BASE64 | The algorithm used to construct the prefix or infix of the blob store path. This setting takes effect if `cluster.remote_store.routing_table.path_type` is `HASHED_PREFIX` or `HASHED_INFIX`. Valid values are `FNV_1A_BASE64` and `FNV_1A_COMPOSITE_1`.
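
For illustration only, the remote publication settings from the preceding table could be set explicitly to their documented defaults as in the following sketch. Placing them in `opensearch.yml` is an assumption; the table does not state whether these settings are static or dynamic.

```yml
# Hypothetical opensearch.yml fragment; values are the documented defaults
cluster.remote_store.state.read_timeout: 20s
cluster.remote_store.routing_table.path_type: HASHED_PREFIX
# The hash algorithm applies because path_type is HASHED_PREFIX
cluster.remote_store.routing_table.path_hash_algo: FNV_1A_BASE64
```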
