Discussion on combination of result sets from different query types #4557
Would it make sense to extend rescoring to run at a higher level than the shard (i.e., at merge time)?
In general, I think it would be similar to a rescore at the coordinator level. The workflow I have in mind would:
I think extending rescore might be confusing from a user perspective (i.e., when does it happen: at the shard or at the coordinator?).
Yes, this can be seen as a possible solution, and I think we should do some brainstorming around how we want to do this. But from a requirements standpoint, what Jack mentioned in the comment captures the exact requirements of the above ask. Whether we want to call it rescore or some other name is something we can debate.
Take a look at this paper from the Pinecone team: https://arxiv.org/abs/2210.11934. Convex combination shows promise as a simple, effective way to combine the two hybrid scores (normalize each, then regress to find a linear scaling factor between the two distributions). Distributed normalization is of course the challenge; shard-level might be workable. I'd suggest starting with simply allowing the scores from each query type to be returned and combined as the developer sees fit. For example, convex combination could be implemented with two script_score queries inside a dis_max with tie_breaker=1, where each script_score applies a precomputed normalization function and boost for its query type.
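A sketch of what that could look like in query DSL, assuming hypothetical index/field names, weights, and normalization bounds (minScore/maxScore would come from offline analysis of each score distribution). With tie_breaker set to 1.0, dis_max sums the clause scores, which yields the convex combination:

```json
GET /my-index/_search
{
  "query": {
    "dis_max": {
      "tie_breaker": 1.0,
      "queries": [
        {
          "script_score": {
            "query": { "multi_match": { "query": "search terms", "fields": ["title", "body"] } },
            "script": {
              "source": "params.w * Math.max(0.0, (_score - params.minScore) / (params.maxScore - params.minScore))",
              "params": { "w": 0.4, "minScore": 0.0, "maxScore": 20.0 }
            }
          }
        },
        {
          "script_score": {
            "query": { "knn": { "embedding_field": { "vector": [0.1, 0.2, 0.3], "k": 100 } } },
            "script": {
              "source": "params.w * _score",
              "params": { "w": 0.6 }
            }
          }
        }
      ]
    }
  }
}
```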
What will be the order of execution of the BM25 (maybe multi-match) query and the k-NN query if we use a hybrid search query?
We have not decided yet. I am working on building a proposal around this and will update this issue once I have done some research. At a high level, I was thinking of running the queries in parallel, so that we can reduce the latency, and then normalizing the scores before we combine them. But this is still a very early thought; we will need to see what works best.
@aishwaryabajaj-54 do we see a difference if the queries are performed in a different order? As per my understanding, if we normalize the scores of each query and then combine them, it doesn't matter in which order the queries are executed. Please let us know if you think otherwise and whether there is a case where this assumption doesn't hold true.
@dblock |
Thanks @navneet1v. Since we're talking about additional phases in query and fetch, cc-ing @nknize, who has been thinking about refactoring aggregation, which I believe is either on the same path or at least related.
I think Elasticsearch has a solution for this: https://www.elastic.co/guide/en/elasticsearch/reference/master/knn-search.html#_combine_approximate_knn_with_other_features. Maybe this could be replicated in OpenSearch? I've created a feature request with the same recommendation: opensearch-project/k-NN#717. I realised that the query below simply retrieves based on keyword matching and then re-ranks based on vector similarity.
I've tried to adapt the recommended answer from https://stackoverflow.com/questions/67701323/full-text-and-knn-vector-hybrid-search-for-elastic on a test index, and it seems to work for me. Any thoughts on this solution? Perhaps latency concerns?
I have responded on the feature request; let's use that issue to talk more about it. Coming back to the problem statement of this particular GitHub issue: all the current score combinations work at the shard level. We don't want to combine the scores at the shard level, but at the coordinator node level. Also, the scores generated by keyword match and k-NN are not on the same scale: k-NN scores can be between 0 and 1, while BM-25 scores can be floating-point values > 1. Quoting from the description:
Yeah, that makes sense. One way could be to use a sigmoid function on the BM-25 score to map it between 0 and 1. The parameters of the function could be hyperparameters that the user selects; see the example below.
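A minimal sketch of that idea as a script_score query, assuming hypothetical steepness (k) and midpoint (x0) hyperparameters and placeholder field names:

```json
GET /my-index/_search
{
  "query": {
    "script_score": {
      "query": { "multi_match": { "query": "search terms", "fields": ["title", "body"] } },
      "script": {
        "source": "1.0 / (1.0 + Math.exp(-params.k * (_score - params.x0)))",
        "params": { "k": 0.5, "x0": 10.0 }
      }
    }
  }
}
```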
We want to use something similar to get results from a vector field along with exact match. Currently the scores received from both queries are not on the same scale, as mentioned in the discussion above. Following is the kind of query we expect to work to achieve this.
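A hypothetical sketch of such a query (field names, the vector values, and k are placeholders, not the actual query from this comment):

```json
GET /my-index/_search
{
  "size": 10,
  "query": {
    "bool": {
      "should": [
        { "multi_match": { "query": "search terms", "fields": ["title", "description"] } },
        { "knn": { "embedding_field": { "vector": [0.1, 0.2, 0.3], "k": 10 } } }
      ]
    }
  }
}
```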
Will the queries above run in parallel or in series? The search speed of BM25 is relatively slow.
Hi @ankitas3, the speed of BM-25 depends on many factors. I would recommend running the multi-match query separately to find out whether BM-25 is really running slow; if yes, please try to optimize the BM-25 query.
@navneet1v, in continuation of what @ankitas3 is doing: is it possible to execute the queries in parallel, or maybe specify the execution order, to reduce the latency?
@navneet1v I tried running BM25 separately, and it is slow. Can you share any references for improving its speed?
Right now there is no way to execute them in parallel in a single query. You can look at the _msearch API, but I am not sure if that is what you are looking for; it works something like the _bulk API. For improving the speed of the query, I would recommend cutting a separate GitHub issue, as this one is not about improving BM-25 speed. In the new issue, please make sure to provide details like the OpenSearch version, operating system, index shards, number of nodes, etc. On that issue we can talk about what you can do to reduce the latency. @ankitas3
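For reference, a minimal _msearch request that issues both queries in one round trip (index, field names, and the vector are placeholders). Note that the responses come back as two independent result sets, which the client must combine itself:

```json
POST /_msearch
{ "index": "my-index" }
{ "query": { "multi_match": { "query": "search terms", "fields": ["title", "body"] } } }
{ "index": "my-index" }
{ "query": { "knn": { "embedding_field": { "vector": [0.1, 0.2, 0.3], "k": 10 } } } }
```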
@navneet1v, I have tried _msearch and it has its own cons. Can we consider the speed and relevance benchmarking of parallel execution and score normalization for both queries (BM-25 and ANN) in the same GH issue?
@aishwaryabajaj-54 can you please elaborate? What is your ask here, and what is it related to?
I am closing this issue, as the problem defined above is being taken care of via opensearch-project/neural-search#123. Please +1 that issue to support the work. I will keep it updated and use it to track progress.
Recently, we have received interest from k-NN in combining results from text-matching queries with k-NN queries (ref). So I wanted to start a discussion about result combination and the problems with doing it when scores are computed at the shard level.
From my understanding, the only way to currently combine scores from different queries is through a boolean query. In a boolean query, multiple query clauses are provided in the should or must section, and the scores are combined via addition. On top of this, with function scoring, it is possible to further manipulate the score of the results before the addition of scores.
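As an illustration of this current mechanism (index and field names are placeholders), a bool query sums the scores of its should clauses, and a function_score wrapping one clause scales that clause's contribution before the addition:

```json
GET /my-index/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "title": "search terms" } },
        {
          "function_score": {
            "query": { "match": { "body": "search terms" } },
            "script_score": {
              "script": { "source": "_score * 0.5" }
            }
          }
        }
      ]
    }
  }
}
```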
That being said, I see a few limitations with this approach.
First, it is difficult to combine result sets whose scores are produced via different scoring methods. In order to effectively combine results, the different queries' scores would need to be put on the same scale. By this, I mean that a score would need to meet two requirements: (1) indicate the relative relevance between its document and the other documents scored in the query, and (2) be comparable with the relative relevance of results from other queries. For example, for k-NN the score range may be 0-1, while BM25 scores may range from 0 to Float.MAX_VALUE (I think). With this, it would be difficult to figure out an effective way to weight each result appropriately. One way to do this would be to normalize the scores before combining them. Normalization might be possible through rescoring, but this would happen at the shard level, which could cause problems when results are combined. For instance, if one shard has better results than another, normalization may skew the importance so that the top results from the latter shard appear better than those from the former.
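For instance, min-max normalization would map each query's scores onto [0, 1] before combination, using the minimum and maximum observed in that query's result set:

$$s' = \frac{s - s_{\min}}{s_{\max} - s_{\min}}$$

where $s$ is a document's raw score from one query. The catch described above is that $s_{\min}$ and $s_{\max}$ would ideally be global values rather than per-shard ones.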
Second, it is not possible to consider global hits for re-ranking. Because scores are assigned at the shard level, any rescoring has to be done at the shard level. I see a problem with this in two cases: first, for score combination, if an index has a significant number of shards, a user may want only the top results to be combined, instead of combining the results of all shards in the index and then aggregating at the coordinator; second, in the future, a user may have an expensive model that they want to run the results through to re-rank them, and they don't want to run it for each shard in the index.
That being said, my preliminary thought on creating a solution for this would be some kind of search-and-merge API that might look like the sketch below:
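A rough sketch of such a request (the endpoint name, the technique and weights keys, and all field values are purely illustrative, not an existing API):

```json
POST /my-index/_search_and_merge
{
  "queries": [
    { "multi_match": { "query": "search terms", "fields": ["title", "body"] } },
    { "knn": { "embedding_field": { "vector": [0.1, 0.2, 0.3], "k": 10 } } }
  ],
  "merge": {
    "technique": "normalize_and_sum",
    "weights": [0.4, 0.6]
  }
}
```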
Here, queries would be a list of queries to be executed independently, and merge would contain the logic to combine the results at the coordinator level.
All that being said, I want to see what people think of doing this. Is there a way to accomplish this with the current search interface?