Prototype search pipelines #97
Comments
Hi @msfroh, I would like to understand the concept behind the pipelines. Is it similar to Redis pipelining, which optimizes round-trip times by batching requests on the client-side socket and sending them to the server without waiting for the replies, then reading all the replies in a single step? Is this the design model of the ingest and search pipelines? Thanks in advance!
Hi @Jeevananthan-23, the motivation is about providing a (relatively) lightweight way to modify behavior of searches at the cluster level, since that may make more sense than modifying behavior at the application layer.
For example, ingest pipelines provide a way of manipulating incoming documents on the cluster before they're sent for indexing. You could just modify the documents before sending them to the cluster in the first place, but maybe that's not convenient (like maybe you have multiple applications sending documents). Also, you get the open-source benefit where one person can write a useful ingest pipeline processor and share it with other OpenSearch users, who don't need to modify any of their applications' indexing code.
On the search side, the specific thing we've been trying to tackle is final-stage rerankers (which is what we've been covering in https://github.com/opensearch-project/search-processor), where you want to run the collated search results through an external reranker to get more relevant results than you could get through term frequency-based relevance alone. You could send the results you get back from OpenSearch to the external reranker, but by letting the cluster drive the transformation you don't need to modify your search application. More importantly, one person can build and release a search pipeline processor that integrates with an external reranker, and many users can benefit without each having to modify their search application.
Inspired by ingest pipelines, we realized that "functional operator" model (where an ingest pipeline processor is effectively a function that takes an
The linked RFC goes into much more detail, but I hope the above is a useful summary.
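To make the final-stage reranker idea a bit more concrete, here is a minimal sketch of a response-side pipeline, assuming each processor is just a function from a search response to a search response. Every name in it (`SearchResponse`, `rerank_with`, `truncate`, `run_pipeline`) is hypothetical and made up for illustration; none of this is the OpenSearch API, and the scoring callback merely stands in for a call to an external reranker.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical types for illustration only; these are not OpenSearch classes.
Hit = Dict[str, Any]


@dataclass
class SearchResponse:
    hits: List[Hit] = field(default_factory=list)


# A response processor is modeled as a plain function: response in, response out.
ResponseProcessor = Callable[[SearchResponse], SearchResponse]


def rerank_with(score_fn: Callable[[Hit], float]) -> ResponseProcessor:
    """Build a processor that reorders hits by score_fn, highest first.

    score_fn is a stand-in for an external reranker (an assumption of this sketch).
    """
    def process(response: SearchResponse) -> SearchResponse:
        ranked = sorted(response.hits, key=score_fn, reverse=True)
        return SearchResponse(hits=ranked)
    return process


def truncate(n: int) -> ResponseProcessor:
    """Build a processor that keeps only the first n hits."""
    def process(response: SearchResponse) -> SearchResponse:
        return SearchResponse(hits=response.hits[:n])
    return process


def run_pipeline(processors: List[ResponseProcessor],
                 response: SearchResponse) -> SearchResponse:
    """Apply each processor in order, feeding each output into the next."""
    for processor in processors:
        response = processor(response)
    return response


if __name__ == "__main__":
    collated = SearchResponse(hits=[
        {"_id": "a", "clicks": 1},
        {"_id": "b", "clicks": 9},
        {"_id": "c", "clicks": 5},
    ])
    pipeline = [rerank_with(lambda hit: hit["clicks"]), truncate(2)]
    print([hit["_id"] for hit in run_pipeline(pipeline, collated).hits])
```

The point of the sketch is that the search application only issues the query; the reordering happens in the processor chain, so a processor written once could be reused without touching application code.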
I'm going to put together a scrappy first implementation of search pipelines.
This first implementation will largely be a copy/paste from ingest pipelines.
I think it should be a good conversation-starter about whether/how to share implementation with ingest pipelines. Depending on where I get with this task, it may be throwaway learning code or it may be the first draft of what we eventually want to merge.
This task should be moved to the OpenSearch core project, but I'm creating it here as a placeholder.
The goals for my prototype include:
Features that can come later (but before "release") include:
ingest-common).
BracketProcessor (processors that modify both request and response, with state carried from request time to response time).
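One possible reading of the BracketProcessor idea, sketched under stated assumptions: the processor rewrites the request, hands back some state, and later receives that same state when it rewrites the response. The interface below (`process_request`, `process_response`, `search_with_brackets`) and the `OversampleThenTrim` example are hypothetical names invented for this sketch, not the prototype's actual design.

```python
from typing import Any, Callable, Dict, List, Tuple

Request = Dict[str, Any]
Response = Dict[str, Any]


class BracketProcessor:
    """Hypothetical interface: wraps a search on both the request and response side."""

    def process_request(self, request: Request) -> Tuple[Request, Any]:
        """Return the modified request plus state to carry to response time."""
        raise NotImplementedError

    def process_response(self, response: Response, state: Any) -> Response:
        """Return the modified response, using the state saved at request time."""
        raise NotImplementedError


class OversampleThenTrim(BracketProcessor):
    """Ask for more hits than the caller requested, then trim back afterwards."""

    def __init__(self, factor: int) -> None:
        self.factor = factor

    def process_request(self, request: Request) -> Tuple[Request, Any]:
        # Assumption of this sketch: a missing "size" defaults to 10.
        original_size = request.get("size", 10)
        modified = {**request, "size": original_size * self.factor}
        return modified, original_size  # the original size is the carried state

    def process_response(self, response: Response, state: Any) -> Response:
        return {**response, "hits": response["hits"][:state]}


def search_with_brackets(request: Request,
                         brackets: List[BracketProcessor],
                         execute: Callable[[Request], Response]) -> Response:
    """Run request-side hooks in order, execute, then response-side hooks in reverse."""
    states: List[Any] = []
    for bracket in brackets:
        request, state = bracket.process_request(request)
        states.append(state)
    response = execute(request)
    for bracket, state in zip(reversed(brackets), reversed(states)):
        response = bracket.process_response(response, state)
    return response
```

Returning the state from `process_request` rather than storing it on the processor keeps each processor instance free of per-search fields, which matters if one instance serves many concurrent searches. Unwinding the response hooks in reverse order is a design choice of this sketch so that the brackets nest.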