[BUG] IndexSettings gets into invalid states on a cluster #12350
Labels
bug, Storage:Snapshots, v2.13.0, v3.0.0
Describe the bug
During snapshot restore, and in other as-yet-unknown scenarios, settings are not validated or migrated to be compatible with future versions. This allows clusters to get into states where new nodes cannot join the cluster because of validation checks. I have debugged clusters in this state and have found records indicating that this symptom has occurred many times.
Related component
Storage:Snapshots
To Reproduce
While I've seen clusters with green indices in this state, I don't know how that repro was possible.
Alternative Repro
./gradlew :server:internalClusterTest --tests org.opensearch.snapshots.RestoreSnapshotInvalidStateIT
Expected behavior
Indices should not be created in invalid states. There appear to be ways to modify IndexSettings that bypass the validation checks but then prevent reloading in future scenarios.
MapperService
In this specific repro, index.mapper.dynamic is not checked for validity during index creation, only when MapperService is constructed. While this is discoverable by looking at failed shard counts, the index still exists in the cluster state.
OpenSearch/server/src/main/java/org/opensearch/index/mapper/MapperService.java
Lines 263 to 265 in b19e427
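One way to surface this earlier would be to run an equivalent check when the index is created or restored, so the request is rejected before the index reaches the cluster state. The following is a minimal sketch of that idea and not OpenSearch code: EagerSettingsCheck, REMOVED_SETTINGS, and validate are hypothetical names, and a plain Map stands in for the real Settings class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of OpenSearch: flags removed settings up front.
final class EagerSettingsCheck {
    // Assumption: the setting keys that later code paths refuse to load.
    static final Set<String> REMOVED_SETTINGS = Set.of("index.mapper.dynamic");

    // Returns one message per offending key; an empty list means the settings pass.
    static List<String> validate(Map<String, String> indexSettings) {
        List<String> problems = new ArrayList<>();
        for (String key : indexSettings.keySet()) {
            if (REMOVED_SETTINGS.contains(key)) {
                problems.add("setting [" + key + "] is no longer supported");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> restored = Map.of(
            "index.number_of_shards", "1",
            "index.mapper.dynamic", "false"
        );
        // A create or restore path could fail the request when this list is non-empty.
        System.out.println(validate(restored));
    }
}
```

Returning a list of problems instead of throwing on the first one lets the caller choose between failing the request and logging a warning.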
MergeSchedulerConfig
There was a case where max_thread_count was higher than max_merge_count. When IndicesClusterStateService.deleteIndices(...) was called, it threw an exception because it couldn't create the IndexSettings.
OpenSearch/server/src/main/java/org/opensearch/index/MergeSchedulerConfig.java
Lines 139 to 143 in b19e427
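A guard for this pair of settings could run before the index metadata is accepted, so that building the settings object later (including on the delete path) cannot fail. The sketch below is an illustration under assumptions, not the OpenSearch implementation: MergeSettingsCheck and validate are hypothetical, the defaults passed in the example are made up, and the invariant used (max_thread_count must not exceed max_merge_count) is the one enforced by Lucene's ConcurrentMergeScheduler.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, not OpenSearch code: checks the merge scheduler pair
// before index metadata is accepted.
final class MergeSettingsCheck {
    static final String MAX_THREAD_COUNT = "index.merge.scheduler.max_thread_count";
    static final String MAX_MERGE_COUNT = "index.merge.scheduler.max_merge_count";

    // Returns an error message when the pair is inconsistent, otherwise empty.
    // The caller supplies the defaults; the values used in main are illustrative.
    static Optional<String> validate(Map<String, String> settings, int defaultThreads, int defaultMerges) {
        int threads = Integer.parseInt(settings.getOrDefault(MAX_THREAD_COUNT, String.valueOf(defaultThreads)));
        int merges = Integer.parseInt(settings.getOrDefault(MAX_MERGE_COUNT, String.valueOf(defaultMerges)));
        if (threads < 1 || merges < 1) {
            return Optional.of("both counts must be at least 1");
        }
        if (threads > merges) {
            return Optional.of("max_thread_count (= " + threads
                + ") should be <= max_merge_count (= " + merges + ")");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Only max_merge_count is overridden, so it falls below the assumed default thread count.
        Map<String, String> restored = Map.of(MAX_MERGE_COUNT, "2");
        System.out.println(validate(restored, 4, 9).orElse("ok"));
    }
}
```

Resolving defaults before comparing matters here: a single overridden value can be valid on its own and still conflict with the default of the other setting.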
Additional Details
Host/Environment
OpenSearch 2.3 & OpenSearch 2.7
Additional context
index.mapper.dynamic usage instead of erroring #11193