-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Discuss] Moving asynchronous search repo to OpenSearch core #127
Comments
What are the cons? What can't be done as a plugin vs. being in core? |
I agree we should do this - What would we do with the current API exposed to customers? Will we be able to keep the backward compatibility once its moved to core? |
Asynchronous search seems like a low-level building block that other features and plugins would build on top of, and as such seems like a good candidate to build directly into the core. I'm basically rephrasing a question above, but is the intent here to move the exact functionality of the existing plugin into core, so that existing users would get the same experience they just wouldn't have to install an extra plugin? Or is the proposal to build essentially asynchronous search v2.0 that existing users would have to make changes to use? |
I would agree with @andrross, I would consider it to be baked into core (in most efficient way possible) since it is directly improving the usability of search when asynchronicity is preffered. |
The idea is to bake this into core, so that existing search users/other search plugins(SQL/PPL/kNN) are able to leverage the same and also to unify search experience by having a common search API with a mode(sync/async) to orchestrate the search For async search users this might be a breaking change once we cut over in a major version release(maybe 3.0), but till then we plan on supporting BWC |
Does the current implementation add thread pool tracking to the |
Async search uses the same Lines 87 to 95 in d57c851
|
Would it be fair to describe this as the following?
It potentially seems cleaner to reimplement this feature natively without ever needing to move the existing plugin into a core module. We can keep the plugin around for backward compatibility in the meantime, but once the native mechanism is ready and SQL, kNN, etc move to use it then we can remove support for the plugin. Is there anything that would prevent taking that approach? |
We would want to reuse code as much as possible, minimize code duplication. It doesn't seem optimal to reimplement(except for what's needed to move to core) or maintain redundant pieces of code at multiple places. I would choose it as a last resort to avoid breaking changes |
This is what I was quickly skimming. We need to be sure it is getting tracked so we can "easily" diagnose any thread pool issues if they arise. It makes handling stackOverflow questions much easier :)
@Bukhtawar it looks like a lot of the code is REST layer API implementation and business logic around managing search handlers. How much is actual core changes?
If a large chunk of the plugin is API implementation (e.g., say 80%) then I could see a strong justification to move it to core now so we can ensure consistency of the API as we move to segrep and decoupling readers from writers. |
@nknize Since this is a plugin the core changes are minimal. The plumbing work around search progress listeners to support async search largely sits in core already. The plugin majorly orchestrates it and supports add on features around persisting responses for later retrieval, cancelling long running searches, refreshing results as search progresses etc |
Thx @Bukhtawar! I can see a good reason for moving this as a plugin (not module) to core. The progress not perfection here is that we would be adding the APIs that we need regardless of the async search implementation details (docrep vs segrep). My concern would be enabling this as a module in the "first version" w/o proper threadpool tracking in the cat APIs. We certainly do not want to add resource usage we cannot monitor using the existing APIs. |
It would be also great to have client api for async search to be integrated with RestHighLevelClient |
Just a corollary: Would there be value in storing final async search response on S3 compared to storing it on disk. Not just a disk space gain but as a comparison between memory overhead of indexing a large async search response (with replicas) vs storing it on S3? @Bukhtawar thoughts? |
I see that nobody seems to argue against doing this -- it's been idle for 2 years, though. Could we just move the code from this repo to OpenSearch core's |
@msfroh That goes against what we've recently recommended for other plugins. We'd need to be super clear about why its the right decision in this case. If the long term plan is to natively integrate async search into the core (i.e. not an optional plugin) then that might make sense. However, if the likely result is that the code moves into the core repo but otherwise goes untouched (which is sort of the status quo for this feature as I believe there hasn't been any real non-maintenance work recently) then I'm not really convinced its the right thing to do. |
Aha -- I was assuming that's where we would go eventually. Make it a core feature of OpenSearch itself (that may not get much in the way of ongoing development, if it "just works"). |
I'll also note that 7 of the 19 open issues in this repo right now now are related to failing and/or flaky tests. That does not make me enthusiastic about bringing this code into the core repository. |
Is your feature request related to a problem? Please describe.
Creating this issue to gather thoughts on moving this repo to core. Historically some decisions were made before the fork was done and plugins were those limited ways to accomplish building asynchronous search. Reasons why moving to core makes sense
Pros
Cons
Describe the solution you'd like
Move the current repo to a module in OpenSearch core
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.
The text was updated successfully, but these errors were encountered: