Compatibility with segment replication #303
Comments
Request plugin owners to add
Hi Plugin Owners,
For async search we need to do 'primary-first' preference routing to maintain consistency of the 'Get asynchronous search results' API, but we can only make this change after preference routing is opened up for segrep + carbon. So we'll need to defer the fix to 2.10.0.
@bharath-techie: Can you add the
@bharath-techie: With opensearch-project/OpenSearch#8536, core now also supports realtime reads for indices with segment replication enabled. Based on your comment above, it seems you are using GET by IDs. If you already use only GET by IDs for realtime reads, no change is needed on the plugin side. But I will let plugin owners decide and validate.
Thanks @dreamer-89. Closing this as this change is no longer required.
Thanks @bharath-techie for looking into this. Caution: please verify that, for strong reads, your plugin relies only on get/mget APIs. I am asking because it is also possible to get strong-read guarantees via the write path by using the IMMEDIATE/WAIT_UNTIL refresh policy, which (under document replication) ensures replica shard copies are refreshed with the indexing request, so that any follow-up data retrieval request receives the latest data. With SEGMENT, neither IMMEDIATE nor WAIT_UNTIL guarantees replica shard refreshes. So for strong reads, any follow-up data retrieval query (other than get/mget APIs) needs the client to provide a preference: either _primary, which hits the primary shard only and provides strong consistency, or _primary_first, which hits the primary first (as the name suggests) and, if the primary is not available, routes the request to replica copies, giving better availability (though data read from a replica could be stale).
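To make the trade-off in this comment concrete, here is a minimal sketch that builds a search URL carrying one of the two preference values. The base URL, index name, and the helper `search_url` are hypothetical and only for illustration; `preference` is the search query parameter, and `_primary` / `_primary_first` are the values named above.

```python
from urllib.parse import urlencode

# Preference values discussed in this thread.
STRONG = "_primary"             # primary shard only: strong consistency
BEST_EFFORT = "_primary_first"  # primary first, then replicas (data may be stale)


def search_url(base_url: str, index: str, preference: str) -> str:
    """Build a _search URL that routes by the given shard preference.

    Hypothetical helper: it only formats the URL and sends nothing.
    """
    if preference not in (STRONG, BEST_EFFORT):
        raise ValueError(f"unsupported preference: {preference}")
    query = urlencode({"preference": preference})
    return f"{base_url.rstrip('/')}/{index}/_search?{query}"
```

A plugin that cannot tolerate stale results would pass `STRONG`; one that prefers availability over freshness would pass `BEST_EFFORT`.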
Summary
With the 2.9.0 release, there are a lot of enhancements going into the segment replication [1][2] feature (which went GA in 2.7.0), and we need to ensure the different plugins are compatible with the current state of this feature. Previously, we ran tests on plugin repos to verify this compatibility, but we want plugin owners to be aware of these changes so that the required updates (if any) can be made. With the 2.10.0 release, the remote store feature is going GA; it internally uses the SEGMENT replication strategy only, i.e. it enforces all indices to use the SEGMENT replication strategy. So it is important to validate that plugins are compatible with the segment replication feature.
What changed
1. Refresh policy behavior
2. Refresh lag on replicas
With segment replication, there is an inherent delay before documents become searchable on replica shard copies. This is because replica shards copy data (segment) files over from the primary. Thus, compared to document replication, replica shards will on average take longer to become consistent with primaries.
3. System/hidden indices support
With opensearch-project/OpenSearch#8200, system and hidden indices are now supported with the SEGMENT replication strategy. We need to ensure there are no bottlenecks that prevent system/hidden indices from working with segment replication.
Next steps
With segment replication, strong reads are not guaranteed. Thus, if the plugin needs strong-read guarantees, especially to compensate for the change in refresh policy behavior and the lag on replicas (points 1 and 2 above), we need to update search requests to target primary shards only. With opensearch-project/OpenSearch#7375, core now supports searches that target primary shards only. Please follow the documentation for examples and details.
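As a rough illustration of this step, the sketch below decides whether a follow-up search should be pinned to primary shards. It is not taken from any plugin. It assumes the index settings are available as a nested dict with the replication strategy under `index.replication.type`, and that a missing value means document replication; both function names are hypothetical.

```python
def uses_segment_replication(index_settings: dict) -> bool:
    """Return True if the nested index settings declare SEGMENT replication.

    Assumption: settings are nested as {"index": {"replication": {"type": ...}}}
    and an absent value means document replication.
    """
    replication = index_settings.get("index", {}).get("replication", {})
    return str(replication.get("type", "DOCUMENT")).upper() == "SEGMENT"


def search_params(index_settings: dict, needs_strong_read: bool) -> dict:
    """Query parameters for a follow-up search request (hypothetical helper)."""
    if needs_strong_read and uses_segment_replication(index_settings):
        # Pin the search to primary shards so it sees the latest writes.
        return {"preference": "_primary"}
    return {}
```

Searches that can tolerate stale data keep the default routing, so replicas continue to share the read load.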
Open questions
In case of any questions or issues, please post them in the core issue.
Reference
[1] Design
[2] Documentation