[BUG] IndexSettings gets into invalid states on a cluster #12350

Open
peternied opened this issue Feb 16, 2024 · 0 comments
Labels: bug, Storage:Snapshots, v2.13.0, v3.0.0

peternied commented Feb 16, 2024

Describe the bug

During snapshot restore, and in other scenarios that are not yet known, settings are neither validated nor migrated to be compatible with future versions. This allows clusters to get into states where new nodes cannot join because of validation checks. I have debugged clusters in this state and have found records indicating that this symptom has occurred many times.

Related component

Storage:Snapshots

To Reproduce

While I've seen clusters with green indices in this state, I don't know how that state came about. The steps below reproduce a similar invalid state:

  1. Create a snapshot
  2. Restore snapshot with settings
POST /_snapshot/my-opensearch-repo/my-first-snapshot/_restore
{
  "indices": "opendistro-reports-definitions",
  "ignore_unavailable": true,
  "include_global_state": false,
  "index_settings": {
     "index.mapper.dynamic": true
  }
}

Alternative Repro

  1. Check out this branch: main...peternied:OpenSearch-1:repro-restore-to-invalid-state
  2. ./gradlew :server:internalClusterTest --tests org.opensearch.snapshots.RestoreSnapshotInvalidStateIT

Expected behavior

Indices should not be created in invalid states. It seems there are ways to modify IndexSettings that bypass validation checks but then prevent the settings from being reloaded later.

MapperService

In this specific repro, index.mapper.dynamic is not checked for validity when the index is created, only when MapperService is constructed. While the failure is discoverable from the failed shard counts, the index still exists in the cluster state.

if (INDEX_MAPPER_DYNAMIC_SETTING.exists(indexSettings.getSettings())) {
    throw new IllegalArgumentException("Setting " + INDEX_MAPPER_DYNAMIC_SETTING.getKey() + " was removed after version 6.0.0");
}
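
To illustrate the expected behavior, here is a minimal, self-contained sketch of rejecting removed settings at the moment a create or restore request is accepted, rather than later when MapperService is constructed. It is not OpenSearch code: the class `RemovedSettingsCheck`, its methods, and the `REMOVED_KEYS` set are hypothetical names chosen for this example, and only the `index.mapper.dynamic` key comes from this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of OpenSearch.
final class RemovedSettingsCheck {
    // Assumed registry of removed keys. Only index.mapper.dynamic is taken from this issue.
    static final Set<String> REMOVED_KEYS = Set.of("index.mapper.dynamic");

    // Returns every requested key that appears in the removed-key registry.
    static List<String> findRemoved(Map<String, ?> requestedSettings) {
        List<String> found = new ArrayList<>();
        for (String key : requestedSettings.keySet()) {
            if (REMOVED_KEYS.contains(key)) {
                found.add(key);
            }
        }
        return found;
    }

    // Fails fast, before any index metadata would be written to the cluster state.
    static void validate(Map<String, ?> requestedSettings) {
        List<String> found = findRemoved(requestedSettings);
        if (!found.isEmpty()) {
            throw new IllegalArgumentException("Removed settings cannot be applied: " + found);
        }
    }
}
```

With a check like this on the restore path, the `index_settings` override from the repro would be rejected with an error instead of producing an index whose shards fail to start.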

MergeSchedulerConfig

There was a case where max_thread_count was higher than max_merge_count. When IndicesClusterStateService.deleteIndices(...) was called, it threw an exception because it couldn't create the IndexSettings.

if (maxThreadCount > maxMergeCount) {
    throw new IllegalArgumentException(
        "maxThreadCount (= " + maxThreadCount + ") should be <= maxMergeCount (= " + maxMergeCount + ")"
    );
}
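
The same invariant could be enforced when a settings change is proposed, so that an invalid combination never reaches the cluster state. The sketch below is an assumption about how such a check might look, not the actual OpenSearch implementation: `MergeSchedulerUpdateCheck` and `applyUpdate` are hypothetical, settings are modeled as a plain `Map`, and the full setting keys are assumed to be the ones behind max_thread_count and max_merge_count.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper for illustration only; not part of OpenSearch.
final class MergeSchedulerUpdateCheck {
    // Assumed full keys for the two settings named in this issue.
    static final String MAX_THREAD_COUNT = "index.merge.scheduler.max_thread_count";
    static final String MAX_MERGE_COUNT = "index.merge.scheduler.max_merge_count";

    // Merges the update into the current settings and validates the combined result,
    // so changing only one of the two values cannot break the invariant.
    static Map<String, Integer> applyUpdate(Map<String, Integer> current, Map<String, Integer> update) {
        Map<String, Integer> merged = new HashMap<>(current);
        merged.putAll(update);
        int maxThreadCount = Objects.requireNonNull(merged.get(MAX_THREAD_COUNT), MAX_THREAD_COUNT + " is required");
        int maxMergeCount = Objects.requireNonNull(merged.get(MAX_MERGE_COUNT), MAX_MERGE_COUNT + " is required");
        if (maxThreadCount > maxMergeCount) {
            throw new IllegalArgumentException(
                "maxThreadCount (= " + maxThreadCount + ") should be <= maxMergeCount (= " + maxMergeCount + ")"
            );
        }
        return merged;
    }
}
```

Validating the merged result rather than the update alone matters here, because the invalid state arises from the relationship between two settings that can be changed independently.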

Additional Details

Host/Environment (please complete the following information):
OpenSearch 2.3 & OpenSearch 2.7

@peternied added the bug, untriaged, Indexing, and Cluster Manager labels on Feb 16, 2024
@reta added the v3.0.0 and v2.13.0 labels on Feb 21, 2024
@reta removed the untriaged label on Feb 21, 2024
@mgodwan removed the Indexing label on Feb 21, 2024