Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit RW separation to remote store enabled clusters and update recovery flow #16760

Merged
merged 15 commits into from
Jan 10, 2025

Conversation

mch2
Copy link
Member

@mch2 mch2 commented Dec 2, 2024

Description

This PR includes multiple changes to search replica recovery to further decouple these shards from primaries.

  1. Change to recover as empty store instead of peer. This will run a store recovery that syncs segments from remote store directly and eliminate any primary communication.
  2. Remove search replicas from the in-sync allocation ID set and update routing table to exclude them from allAllocationIds. This ensures primaries aren't tracking or validating the routing table for any search replica's presence.
  3. Simplify RW separation by limiting to only remote store enabled clusters. There are versions of the above changes that are still possible with primary based node-node replication but they require additional public api changes and I don't think we have the need at this time.

Related Issues

Resolves #15952

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added the v2.19.0 Issues and PRs related to version 2.19.0 label Dec 2, 2024
Copy link
Contributor

github-actions bot commented Dec 2, 2024

❌ Gradle check result for a932d59:

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Dec 2, 2024

❌ Gradle check result for a932d59:

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Dec 3, 2024

✅ Gradle check result for a932d59: SUCCESS

Copy link

codecov bot commented Dec 3, 2024

Codecov Report

Attention: Patch coverage is 69.35484% with 19 lines in your changes missing coverage. Please review.

Project coverage is 72.10%. Comparing base (f6dc4a6) to head (1c95833).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
...in/java/org/opensearch/index/shard/IndexShard.java 44.44% 2 Missing and 3 partials ⚠️
...search/cluster/routing/IndexShardRoutingTable.java 42.85% 2 Missing and 2 partials ⚠️
.../opensearch/cluster/routing/IndexRoutingTable.java 78.57% 2 Missing and 1 partial ⚠️
...llocation/decider/ThrottlingAllocationDecider.java 84.61% 0 Missing and 2 partials ⚠️
...a/org/opensearch/cluster/routing/ShardRouting.java 66.66% 0 Missing and 1 partial ⚠️
...uster/routing/allocation/IndexMetadataUpdater.java 75.00% 0 Missing and 1 partial ⚠️
...org/opensearch/index/seqno/ReplicationTracker.java 0.00% 0 Missing and 1 partial ⚠️
...a/org/opensearch/index/shard/ReplicationGroup.java 50.00% 0 Missing and 1 partial ⚠️
...java/org/opensearch/index/shard/StoreRecovery.java 80.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16760      +/-   ##
============================================
- Coverage     72.17%   72.10%   -0.08%     
+ Complexity    65251    65197      -54     
============================================
  Files          5301     5301              
  Lines        303662   303723      +61     
  Branches      43989    44006      +17     
============================================
- Hits         219181   218994     -187     
- Misses        66552    66730     +178     
- Partials      17929    17999      +70     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mch2 mch2 added the backport 2.x Backport to 2.x branch label Dec 3, 2024
Copy link
Contributor

github-actions bot commented Dec 3, 2024

✅ Gradle check result for 8935bc7: SUCCESS

mch2 added 7 commits January 8, 2025 17:13
Signed-off-by: Marc Handalian <[email protected]>
… a remote store cluster."

reverting this, we already check for remote store earlier.

This reverts commit 48ca1a3.

Signed-off-by: Marc Handalian <[email protected]>
Signed-off-by: Marc Handalian <[email protected]>
This commit adds PR feedback and recovery tests post node restart.

Signed-off-by: Marc Handalian <[email protected]>
@mch2
Copy link
Member Author

mch2 commented Jan 9, 2025

apologies for the rebase, had to fix DCO check on an old commit

Copy link
Contributor

github-actions bot commented Jan 9, 2025

❌ Gradle check result for cf68380: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Marc Handalian <[email protected]>
Copy link
Contributor

github-actions bot commented Jan 9, 2025

❌ Gradle check result for eaa38d9: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Jan 9, 2025

✅ Gradle check result for eaa38d9: SUCCESS

Copy link
Contributor

❌ Gradle check result for 002323e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

✅ Gradle check result for 002323e: SUCCESS

Copy link
Contributor

✅ Gradle check result for 1c95833: SUCCESS

@mch2 mch2 merged commit 8191de8 into opensearch-project:main Jan 10, 2025
36 of 37 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-16760-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 8191de85856d291507d09a7fd425908843ed8675
# Push it to GitHub
git push --set-upstream origin backport/backport-16760-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-16760-to-2.x.

mch2 added a commit to mch2/OpenSearch that referenced this pull request Jan 10, 2025
…ery flow (opensearch-project#16760)

* Update search only replica recovery flow

This PR includes multiple changes to search replica recovery.
1. Change search only replica copies to recover as empty store instead of PEER. This will run a store recovery that syncs segments from remote store directly and eliminate any primary communication.
2. Remove search replicas from the in-sync allocation ID set and update routing table to exclude them from allAllocationIds.  This ensures primaries aren't tracking or validating the routing table for any search replica's presence.
3. Change search replica validation to require remote store.  There are versions of the above changes that are still possible with primary based node-node replication, but I don't think they are worth making  at this time.

Signed-off-by: Marc Handalian <[email protected]>

* more coverage

Signed-off-by: Marc Handalian <[email protected]>

* add changelog entry

Signed-off-by: Marc Handalian <[email protected]>

* add assertions that Search Replicas are not in the in-sync id set nor the AllAllocationIds set in the routing table

Signed-off-by: Marc Handalian <[email protected]>

* update async task to only run if the FF is enabled and we are a remote store cluster.

This check had previously only checked for segrep

Signed-off-by: Marc Handalian <[email protected]>

* clean up max shards logic

Signed-off-by: Marc Handalian <[email protected]>

* remove search replicas from check during renewPeerRecoveryRetentionLeases

Signed-off-by: Marc Handalian <[email protected]>

* Revert "update async task to only run if the FF is enabled and we are a remote store cluster."

reverting this, we already check for remote store earlier.

This reverts commit 48ca1a3.

Signed-off-by: Marc Handalian <[email protected]>

* Add more tests for failover case

Signed-off-by: Marc Handalian <[email protected]>

* Update remotestore restore logic and add test ensuring we can restore only writers when red

Signed-off-by: Marc Handalian <[email protected]>

* Fix Search replicas to honor node level recovery limits

Signed-off-by: Marc Handalian <[email protected]>

* Fix translog UUID mismatch on existing store recovery.

This commit adds PR feedback and recovery tests post node restart.

Signed-off-by: Marc Handalian <[email protected]>

* Fix spotless

Signed-off-by: Marc Handalian <[email protected]>

* Fix bug with remote restore and add more tests

Signed-off-by: Marc Handalian <[email protected]>

---------

Signed-off-by: Marc Handalian <[email protected]>
(cherry picked from commit 8191de8)
msfroh pushed a commit that referenced this pull request Jan 10, 2025
…ery flow (#16760) (#16999)

* Update search only replica recovery flow

This PR includes multiple changes to search replica recovery.
1. Change search only replica copies to recover as empty store instead of PEER. This will run a store recovery that syncs segments from remote store directly and eliminate any primary communication.
2. Remove search replicas from the in-sync allocation ID set and update routing table to exclude them from allAllocationIds.  This ensures primaries aren't tracking or validating the routing table for any search replica's presence.
3. Change search replica validation to require remote store.  There are versions of the above changes that are still possible with primary based node-node replication, but I don't think they are worth making  at this time.

Signed-off-by: Marc Handalian <[email protected]>

* more coverage

Signed-off-by: Marc Handalian <[email protected]>

* add changelog entry

Signed-off-by: Marc Handalian <[email protected]>

* add assertions that Search Replicas are not in the in-sync id set nor the AllAllocationIds set in the routing table

Signed-off-by: Marc Handalian <[email protected]>

* update async task to only run if the FF is enabled and we are a remote store cluster.

This check had previously only checked for segrep

Signed-off-by: Marc Handalian <[email protected]>

* clean up max shards logic

Signed-off-by: Marc Handalian <[email protected]>

* remove search replicas from check during renewPeerRecoveryRetentionLeases

Signed-off-by: Marc Handalian <[email protected]>

* Revert "update async task to only run if the FF is enabled and we are a remote store cluster."

reverting this, we already check for remote store earlier.

This reverts commit 48ca1a3.

Signed-off-by: Marc Handalian <[email protected]>

* Add more tests for failover case

Signed-off-by: Marc Handalian <[email protected]>

* Update remotestore restore logic and add test ensuring we can restore only writers when red

Signed-off-by: Marc Handalian <[email protected]>

* Fix Search replicas to honor node level recovery limits

Signed-off-by: Marc Handalian <[email protected]>

* Fix translog UUID mismatch on existing store recovery.

This commit adds PR feedback and recovery tests post node restart.

Signed-off-by: Marc Handalian <[email protected]>

* Fix spotless

Signed-off-by: Marc Handalian <[email protected]>

* Fix bug with remote restore and add more tests

Signed-off-by: Marc Handalian <[email protected]>

---------

Signed-off-by: Marc Handalian <[email protected]>
(cherry picked from commit 8191de8)
meet-v25 pushed a commit to meet-v25/OpenSearch that referenced this pull request Jan 16, 2025
…ery flow (opensearch-project#16760)

* Update search only replica recovery flow

This PR includes multiple changes to search replica recovery.
1. Change search only replica copies to recover as empty store instead of PEER. This will run a store recovery that syncs segments from remote store directly and eliminate any primary communication.
2. Remove search replicas from the in-sync allocation ID set and update routing table to exclude them from allAllocationIds.  This ensures primaries aren't tracking or validating the routing table for any search replica's presence.
3. Change search replica validation to require remote store.  There are versions of the above changes that are still possible with primary based node-node replication, but I don't think they are worth making  at this time.

Signed-off-by: Marc Handalian <[email protected]>

* more coverage

Signed-off-by: Marc Handalian <[email protected]>

* add changelog entry

Signed-off-by: Marc Handalian <[email protected]>

* add assertions that Search Replicas are not in the in-sync id set nor the AllAllocationIds set in the routing table

Signed-off-by: Marc Handalian <[email protected]>

* update async task to only run if the FF is enabled and we are a remote store cluster.

This check had previously only checked for segrep

Signed-off-by: Marc Handalian <[email protected]>

* clean up max shards logic

Signed-off-by: Marc Handalian <[email protected]>

* remove search replicas from check during renewPeerRecoveryRetentionLeases

Signed-off-by: Marc Handalian <[email protected]>

* Revert "update async task to only run if the FF is enabled and we are a remote store cluster."

reverting this, we already check for remote store earlier.

This reverts commit 48ca1a3.

Signed-off-by: Marc Handalian <[email protected]>

* Add more tests for failover case

Signed-off-by: Marc Handalian <[email protected]>

* Update remotestore restore logic and add test ensuring we can restore only writers when red

Signed-off-by: Marc Handalian <[email protected]>

* Fix Search replicas to honor node level recovery limits

Signed-off-by: Marc Handalian <[email protected]>

* Fix translog UUID mismatch on existing store recovery.

This commit adds PR feedback and recovery tests post node restart.

Signed-off-by: Marc Handalian <[email protected]>

* Fix spotless

Signed-off-by: Marc Handalian <[email protected]>

* Fix bug with remote restore and add more tests

Signed-off-by: Marc Handalian <[email protected]>

---------

Signed-off-by: Marc Handalian <[email protected]>
meet-v25 pushed a commit to meet-v25/OpenSearch that referenced this pull request Jan 17, 2025
Signed-off-by: meetvm <[email protected]>

Bump com.nimbusds:oauth2-oidc-sdk from 11.19.1 to 11.20.1 in /plugins/repository-azure (opensearch-project#16895)

* Bump com.nimbusds:oauth2-oidc-sdk in /plugins/repository-azure

Bumps [com.nimbusds:oauth2-oidc-sdk](https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect-extensions) from 11.19.1 to 11.20.1.
- [Changelog](https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect-extensions/src/master/CHANGELOG.txt)
- [Commits](https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect-extensions/branches/compare/11.20.1..11.19.1)

---
updated-dependencies:
- dependency-name: com.nimbusds:oauth2-oidc-sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Updating SHAs

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Bump com.netflix.nebula.ospackage-base from 11.10.0 to 11.10.1 in /distribution/packages (opensearch-project#16896)

* Bump com.netflix.nebula.ospackage-base in /distribution/packages

Bumps com.netflix.nebula.ospackage-base from 11.10.0 to 11.10.1.

---
updated-dependencies:
- dependency-name: com.netflix.nebula.ospackage-base
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Bump ch.qos.logback:logback-classic from 1.5.12 to 1.5.15 in /test/fixtures/hdfs-fixture (opensearch-project#16898)

* Bump ch.qos.logback:logback-classic in /test/fixtures/hdfs-fixture

Bumps [ch.qos.logback:logback-classic](https://github.com/qos-ch/logback) from 1.5.12 to 1.5.15.
- [Commits](qos-ch/logback@v_1.5.12...v_1.5.15)

---
updated-dependencies:
- dependency-name: ch.qos.logback:logback-classic
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Bump lycheeverse/lychee-action from 2.1.0 to 2.2.0 (opensearch-project#16897)

* Bump lycheeverse/lychee-action from 2.1.0 to 2.2.0

Bumps [lycheeverse/lychee-action](https://github.com/lycheeverse/lychee-action) from 2.1.0 to 2.2.0.
- [Release notes](https://github.com/lycheeverse/lychee-action/releases)
- [Commits](lycheeverse/lychee-action@v2.1.0...v2.2.0)

---
updated-dependencies:
- dependency-name: lycheeverse/lychee-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Create sub directories for ThirdPartyAudit dependency metadata (opensearch-project#16844)

* Extract jars to sub dirs during thirdPartyAudit task.

Signed-off-by: Finn Carroll <[email protected]>

* Change regex to split on '-'/'.'. Ignore version.

Signed-off-by: Finn Carroll <[email protected]>

* Split on .jar for sub folder prefix.

Signed-off-by: Finn Carroll <[email protected]>

---------

Signed-off-by: Finn Carroll <[email protected]>

Retrieve value from DocValues in a flat_object filed (opensearch-project#16802)

Bump com.microsoft.azure:msal4j from 1.17.2 to 1.18.0 in /plugins/repository-azure (opensearch-project#16918)

* Bump com.microsoft.azure:msal4j in /plugins/repository-azure

Bumps [com.microsoft.azure:msal4j](https://github.com/AzureAD/microsoft-authentication-library-for-java) from 1.17.2 to 1.18.0.
- [Release notes](https://github.com/AzureAD/microsoft-authentication-library-for-java/releases)
- [Changelog](https://github.com/AzureAD/microsoft-authentication-library-for-java/blob/dev/changelog.txt)
- [Commits](AzureAD/microsoft-authentication-library-for-java@v1.17.2...v1.18.0)

---
updated-dependencies:
- dependency-name: com.microsoft.azure:msal4j
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Updating SHAs

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Bump org.apache.commons:commons-text from 1.12.0 to 1.13.0 in /test/fixtures/hdfs-fixture (opensearch-project#16919)

* Bump org.apache.commons:commons-text in /test/fixtures/hdfs-fixture

Bumps org.apache.commons:commons-text from 1.12.0 to 1.13.0.

---
updated-dependencies:
- dependency-name: org.apache.commons:commons-text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Add gRPC server as transport-grpc plugin (opensearch-project#16534)

Introduce auxiliary transport to NetworkPlugin and add gRPC plugin.

Auxiliary transports are optional lifecycle components provided by
network plugins which run in parallel to the http server/native
transport. They are distinct from the existing NetworkPlugin
interfaces of 'getTransports' and 'getHttpTransports' as auxiliary
transports are optional. Each AuxTransport implements it's own
'aux.transport.type' and 'aux.transport.<type>.ports' setting. Since
Security.java initializes previous to Node.java during bootstrap
socket binding permissions are granted based on
'aux.transport.<type>.ports' for each enabled 'aux.transport.type',
falling back to a default if no ports are specified.

Signed-off-by: Finn Carroll <[email protected]>

Update script supports java.lang.String.sha1() and java.lang.String.sha256() methods (opensearch-project#16923)

* Update script supports java.lang.String.sha1() and java.lang.String.sha256() methods

Signed-off-by: Gao Binlong <[email protected]>

* Modify change log

Signed-off-by: Gao Binlong <[email protected]>

---------

Signed-off-by: Gao Binlong <[email protected]>

Workflow benchmark-pull-request.yml fix (opensearch-project#16925)

Signed-off-by: Prudhvi Godithi <[email protected]>

Add benchmark confirm for lucene-10 big5 index snapshot (opensearch-project#16940)

Signed-off-by: Rishabh Singh <[email protected]>

Remove duplicate DCO check (opensearch-project#16942)

Signed-off-by: Andriy Redko <[email protected]>

Allow extended plugins to be optional (opensearch-project#16909)

* Make extended plugins optional

Signed-off-by: Craig Perkins <[email protected]>

* Make extended plugins optional

Signed-off-by: Craig Perkins <[email protected]>

* Load extensions for classpath plugins

Signed-off-by: Craig Perkins <[email protected]>

* Ensure only single instance for each classpath extension

Signed-off-by: Craig Perkins <[email protected]>

* Add test for classpath plugin extended plugin loading

Signed-off-by: Craig Perkins <[email protected]>

* Modify test to allow optional extended plugin

Signed-off-by: Craig Perkins <[email protected]>

* Only optional extended plugins

Signed-off-by: Craig Perkins <[email protected]>

* Add additional warning message

Signed-off-by: Craig Perkins <[email protected]>

* Add to CHANGELOG

Signed-off-by: Craig Perkins <[email protected]>

* Add tag to make extended plugin optional

Signed-off-by: Craig Perkins <[email protected]>

* Only send plugin names when serializing PluginInfo

Signed-off-by: Craig Perkins <[email protected]>

* Keep track of optional extended plugins in separate set

Signed-off-by: Craig Perkins <[email protected]>

* Include in ser/de of PluginInfo

Signed-off-by: Craig Perkins <[email protected]>

* Change to 3_0_0

Signed-off-by: Craig Perkins <[email protected]>

---------

Signed-off-by: Craig Perkins <[email protected]>

Change version in PluginInfo to V_2_19_0 after backport to 2.x merged (opensearch-project#16947)

Signed-off-by: Craig Perkins <[email protected]>

Support object fields in star-tree index (opensearch-project#16728)

---------

Signed-off-by: bharath-techie <[email protected]>

Bump ch.qos.logback:logback-core from 1.5.12 to 1.5.16 in /test/fixtures/hdfs-fixture (opensearch-project#16951)

* Bump ch.qos.logback:logback-core in /test/fixtures/hdfs-fixture

Bumps [ch.qos.logback:logback-core](https://github.com/qos-ch/logback) from 1.5.12 to 1.5.16.
- [Commits](qos-ch/logback@v_1.5.12...v_1.5.16)

---
updated-dependencies:
- dependency-name: ch.qos.logback:logback-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

[Workload Management] Add Workload Management IT (opensearch-project#16359)

* add workload management IT
Signed-off-by: Ruirui Zhang <[email protected]>

* address comments
Signed-off-by: Ruirui Zhang <[email protected]>

---------

Signed-off-by: Ruirui Zhang <[email protected]>

Add new benchmark config for nested workload (opensearch-project#16956)

Signed-off-by: Rishabh Singh <[email protected]>

Bump com.azure:azure-core-http-netty from 1.15.5 to 1.15.7 in /plugins/repository-azure (opensearch-project#16952)

* Bump com.azure:azure-core-http-netty in /plugins/repository-azure

Bumps [com.azure:azure-core-http-netty](https://github.com/Azure/azure-sdk-for-java) from 1.15.5 to 1.15.7.
- [Release notes](https://github.com/Azure/azure-sdk-for-java/releases)
- [Commits](Azure/azure-sdk-for-java@azure-core-http-netty_1.15.5...azure-core-http-netty_1.15.7)

---
updated-dependencies:
- dependency-name: com.azure:azure-core-http-netty
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Updating SHAs

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

Always use constant_score query for match_only_text (opensearch-project#16964)

In some cases, when we create a term query over a `match_only_text`
field, it may still try to compute scores, which prevents early
termination. We should *always* use a constant score query when
querying `match_only_text`, since we don't have the statistics
required to compute scores.

---------

Signed-off-by: Michael Froh <[email protected]>

Changes to support unmapped fields in metric aggregation (opensearch-project#16481)

Avoids exception when querying unmapped field when star tree experimental
feature is enables.

---------

Signed-off-by: expani <[email protected]>

Use async client for delete blob or path in S3 Blob Container (opensearch-project#16788)

* Use async client for delete blob or path in S3 Blob Container

Signed-off-by: Ashish Singh <[email protected]>

* Fix UTs

Signed-off-by: Ashish Singh <[email protected]>

* Fix failures in S3BlobStoreRepositoryTests

Signed-off-by: Ashish Singh <[email protected]>

* Fix S3BlobStoreRepositoryTests

Signed-off-by: Ashish Singh <[email protected]>

* Fix failures in S3RepositoryThirdPartyTests

Signed-off-by: Ashish Singh <[email protected]>

* Fix failures in S3RepositoryPluginTests

Signed-off-by: Ashish Singh <[email protected]>

---------

Signed-off-by: Ashish Singh <[email protected]>

Fix Shallow copy snapshot failures on closed index (opensearch-project#16868)

* Fix shallow v1 snapshot failures on closed index

Signed-off-by: Shubh Sahu <[email protected]>

* UT fix

Signed-off-by: Shubh Sahu <[email protected]>

* Adding UT

Signed-off-by: Shubh Sahu <[email protected]>

* small fix

Signed-off-by: Shubh Sahu <[email protected]>

* Addressing comments

Signed-off-by: Shubh Sahu <[email protected]>

* Addressing comments

Signed-off-by: Shubh Sahu <[email protected]>

* Modifying IT to restore snapshot

Signed-off-by: Shubh Sahu <[email protected]>

---------

Signed-off-by: Shubh Sahu <[email protected]>
Co-authored-by: Shubh Sahu <[email protected]>

Add Response Status Number in http trace logs. (opensearch-project#16978)

Signed-off-by: Rishikesh1159 <[email protected]>

support termQueryCaseInsensitive/termQuery can search from doc_value in flat_object/keyword field (opensearch-project#16974)

Signed-off-by: kkewwei <[email protected]>
Signed-off-by: kkewwei <[email protected]>

use the correct type to widen the sort fields when merging top docs (opensearch-project#16881)

* use the correct type to widen the sort fields when merging top docs

Signed-off-by: panguixin <[email protected]>

* fix

Signed-off-by: panguixin <[email protected]>

* apply commments

Signed-off-by: panguixin <[email protected]>

* changelog

Signed-off-by: panguixin <[email protected]>

* add more tests

Signed-off-by: panguixin <[email protected]>

---------

Signed-off-by: panguixin <[email protected]>

Fix multi-value sort for unsigned long (opensearch-project#16732)

* Fix multi-value sort for unsigned long

Signed-off-by: panguixin <[email protected]>

* Add initial rest-api-spec tests

Signed-off-by: Andriy Redko <[email protected]>

* add more rest tests

Signed-off-by: panguixin <[email protected]>

* fix

Signed-off-by: panguixin <[email protected]>

* fix

Signed-off-by: panguixin <[email protected]>

* Extend MultiValueMode with dedicated support of unsigned_long doc values

Signed-off-by: Andriy Redko <[email protected]>

* Add CHANGELOG.md, minor cleanups

Signed-off-by: Andriy Redko <[email protected]>

* Correct the license headers

Signed-off-by: Andriy Redko <[email protected]>

* Correct the @publicapi version

Signed-off-by: Andriy Redko <[email protected]>

* Replace SingletonSortedNumericUnsignedLongValues with LongToSortedNumericUnsignedLongValues (as per review comments)

Signed-off-by: Andriy Redko <[email protected]>

---------

Signed-off-by: panguixin <[email protected]>
Signed-off-by: Andriy Redko <[email protected]>
Co-authored-by: Andriy Redko <[email protected]>

Update Gradle to 8.12 (opensearch-project#16884)

Signed-off-by: Andriy Redko <[email protected]>

`phone-search` analyzer: don't emit sip/tel prefix, int'l prefix, extension & unformatted input (opensearch-project#16993)

* `phone-search` analyzer: don't emit int'l prefix

this was an oversight in the initial implementation: if the tokenizer
emits the international calling prefix in the search analyzer then all
documents with the same international calling prefix will match.

e.g. when searching for `+1-555-123-4567` not only documents with this
number would match but also any other document with a `1` token (i.e.
any other number with this prefix).

thus the search functionality is currently broken for this analyzer,
making it useless.

the test coverage has now been extended to cover these and other
use-cases.

Signed-off-by: Ralph Ursprung <[email protected]>

* `phone-search` analyzer: don't emit extension & unformatted input

if these tokens are emitted it meant that phone numbers with other
international dialling prefixes still matched.

e.g. searching for `+1 1234` would also match a number stored as
`+2 1234`, which was wrong.

the tokens still need to be emited for the `phone` analyzer, e.g. when
the user only enters the extension / local number it should still match,
the same is with the other ngrams: these are needed for
search-as-you-type style queries where the user input needs to match
against partial phone numbers.

Signed-off-by: Ralph Ursprung <[email protected]>

* `phone-search` analyzer: don't emit sip/tel prefix

in line with the previous two commits, this is something else the search
analyzer shouldn't emit since otherwise searching for any number with
such a prefix will match _any_ document with the same prefix.

Signed-off-by: Ralph Ursprung <[email protected]>

---------

Signed-off-by: Ralph Ursprung <[email protected]>

Limit RW separation to remote store enabled clusters and update recovery flow (opensearch-project#16760)

* Update search only replica recovery flow

This PR includes multiple changes to search replica recovery.
1. Change search only replica copies to recover as empty store instead of PEER. This will run a store recovery that syncs segments from remote store directly and eliminate any primary communication.
2. Remove search replicas from the in-sync allocation ID set and update routing table to exclude them from allAllocationIds.  This ensures primaries aren't tracking or validating the routing table for any search replica's presence.
3. Change search replica validation to require remote store.  There are versions of the above changes that are still possible with primary based node-node replication, but I don't think they are worth making  at this time.

Signed-off-by: Marc Handalian <[email protected]>

* more coverage

Signed-off-by: Marc Handalian <[email protected]>

* add changelog entry

Signed-off-by: Marc Handalian <[email protected]>

* add assertions that Search Replicas are not in the in-sync id set nor the AllAllocationIds set in the routing table

Signed-off-by: Marc Handalian <[email protected]>

* update async task to only run if the FF is enabled and we are a remote store cluster.

This check had previously only checked for segrep

Signed-off-by: Marc Handalian <[email protected]>

* clean up max shards logic

Signed-off-by: Marc Handalian <[email protected]>

* remove search replicas from check during renewPeerRecoveryRetentionLeases

Signed-off-by: Marc Handalian <[email protected]>

* Revert "update async task to only run if the FF is enabled and we are a remote store cluster."

reverting this, we already check for remote store earlier.

This reverts commit 48ca1a3.

Signed-off-by: Marc Handalian <[email protected]>

* Add more tests for failover case

Signed-off-by: Marc Handalian <[email protected]>

* Update remotestore restore logic and add test ensuring we can restore only writers when red

Signed-off-by: Marc Handalian <[email protected]>

* Fix Search replicas to honor node level recovery limits

Signed-off-by: Marc Handalian <[email protected]>

* Fix translog UUID mismatch on existing store recovery.

This commit adds PR feedback and recovery tests post node restart.

Signed-off-by: Marc Handalian <[email protected]>

* Fix spotless

Signed-off-by: Marc Handalian <[email protected]>

* Fix bug with remote restore and add more tests

Signed-off-by: Marc Handalian <[email protected]>

---------

Signed-off-by: Marc Handalian <[email protected]>

Fix case insensitive and escaped query on wildcard (opensearch-project#16827)

* fix case insensitive and escaped query on wildcard

Signed-off-by: gesong.samuel <[email protected]>

* add changelog

Signed-off-by: gesong.samuel <[email protected]>

---------

Signed-off-by: gesong.samuel <[email protected]>
Signed-off-by: Michael Froh <[email protected]>
Co-authored-by: gesong.samuel <[email protected]>
Co-authored-by: Michael Froh <[email protected]>

Bump opentelemetry from 1.41.0 to 1.46.0 and opentelemetry-semconv from 1.27.0-alpha to 1.29.0-alpha (opensearch-project#17000)

Signed-off-by: Andriy Redko <[email protected]>

TransportBulkAction.doRun() (opensearch-project#16950)

Signed-off-by: kkewwei <[email protected]>
Signed-off-by: kkewwei <[email protected]>

Show only intersecting buckets to the Adjacency matrix aggregation (opensearch-project#11733)

Signed-off-by: Ivan Brusic <[email protected]>

Bump com.google.re2j:re2j from 1.7 to 1.8 in /plugins/repository-hdfs (opensearch-project#17012)

* Bump com.google.re2j:re2j from 1.7 to 1.8 in /plugins/repository-hdfs

Bumps [com.google.re2j:re2j](https://github.com/google/re2j) from 1.7 to 1.8.
- [Release notes](https://github.com/google/re2j/releases)
- [Commits](google/re2j@re2j-1.7...re2j-1.8)

---
updated-dependencies:
- dependency-name: com.google.re2j:re2j
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Updating SHAs

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump com.nimbusds:oauth2-oidc-sdk from 11.20.1 to 11.21 in /plugins/repository-azure (opensearch-project#17010)

* Bump com.nimbusds:oauth2-oidc-sdk in /plugins/repository-azure

Bumps [com.nimbusds:oauth2-oidc-sdk](https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect-extensions) from 11.20.1 to 11.21.
- [Changelog](https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect-extensions/src/master/CHANGELOG.txt)
- [Commits](https://bitbucket.org/connect2id/oauth-2.0-sdk-with-openid-connect-extensions/branches/compare/11.21..11.20.1)

---
updated-dependencies:
- dependency-name: com.nimbusds:oauth2-oidc-sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Updating SHAs

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

improve `PhoneNumberAnalyzerTests#testTelPrefixSearch` (opensearch-project#17016)

this way we ensure that it doesn't include any additional tokens which
we don't want.

this is a follow-up to commit 4d94399 / opensearch-project#16993.

Signed-off-by: Ralph Ursprung <[email protected]>

Filter shards for sliced search at coordinator (opensearch-project#16771)

* Filter shards for sliced search at coordinator

Prior to this commit, a sliced search would fan out to every shard,
then apply a MatchNoDocsQuery filter on shards that don't correspond
to the current slice. This still creates a (useless) search context
on each shard for every slice, though. For a long-running sliced
scroll, this can quickly exhaust the number of available scroll
contexts.

This change avoids fanning out to all the shards by checking at the
coordinator if a shard is matched by the current slice. This should
reduce the number of open scroll contexts to max(numShards, numSlices)
instead of numShards * numSlices.

---------

Signed-off-by: Michael Froh <[email protected]>

Upgrade HttpCore5/HttpClient5 to support ExtendedSocketOption in HttpAsyncClient (opensearch-project#16757)

* upgrade httpcore5/httpclient5 to support ExtendedSocketOption in HttpAsyncClient

Signed-off-by: kkewwei <[email protected]>
Signed-off-by: kkewwei <[email protected]>

* Use the Upgrade flow by default

Signed-off-by: Andriy Redko <[email protected]>

* Update Reactor Netty to 1.1.26.Final

Signed-off-by: Andriy Redko <[email protected]>

* Add SETTING_H2C_MAX_CONTENT_LENGTH to configure h2cMaxContentLength for reactor-netty4 transport

Signed-off-by: Andriy Redko <[email protected]>

* Update Apache HttpCore5 to 5.3.2

Signed-off-by: Andriy Redko <[email protected]>

---------

Signed-off-by: kkewwei <[email protected]>
Signed-off-by: kkewwei <[email protected]>
Signed-off-by: Andriy Redko <[email protected]>
Co-authored-by: Andriy Redko <[email protected]>

Update version checks for backport (opensearch-project#17030)

Signed-off-by: Michael Froh <[email protected]>
Signed-off-by: Andriy Redko <[email protected]>
Co-authored-by: Michael Froh <[email protected]>

Fix versions and breaking API changes (opensearch-project#17031)

Signed-off-by: Andriy Redko <[email protected]>

Bump com.nimbusds:nimbus-jose-jwt from 9.47 to 10.0.1 in /test/fixtures/hdfs-fixture (opensearch-project#17011)

* Bump com.nimbusds:nimbus-jose-jwt in /test/fixtures/hdfs-fixture

Bumps [com.nimbusds:nimbus-jose-jwt](https://bitbucket.org/connect2id/nimbus-jose-jwt) from 9.47 to 10.0.1.
- [Changelog](https://bitbucket.org/connect2id/nimbus-jose-jwt/src/master/CHANGELOG.txt)
- [Commits](https://bitbucket.org/connect2id/nimbus-jose-jwt/branches/compare/10.0.1..9.47)

---
updated-dependencies:
- dependency-name: com.nimbusds:nimbus-jose-jwt
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Remove user data from logs when not in debug/trace mode (opensearch-project#17007)

* Remove user data from logs when not in debug/trace mode

Signed-off-by: Mohit Godwani <[email protected]>

Remove user data from logs when not in debug/trace mode (opensearch-project#17007)

* Remove user data from logs when not in debug/trace mode

Signed-off-by: Mohit Godwani <[email protected]>
Signed-off-by: meetvm <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch backport-failed v2.19.0 Issues and PRs related to version 2.19.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RW Separation] Change search replica recovery flow
4 participants