[BUG] Runtime error when using explain=true with multiple script_score neural queries (Null score for the docID: 2147483647) #2176
Comments
Just to add, I'm using nmslib with a field mapping like this:
"title_embedding": {
"type": "knn_vector",
"dimension": 384,
"method": {
"name": "hnsw",
"space_type": "l2",
"engine": "nmslib",
"parameters": {
"ef_construction": 128,
"m": 24
}
}
}
I've just tested using the Lucene engine, and the error does not occur with Lucene. (As an aside, with Lucene, the
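As a rough illustration of the setup discussed here, the sketch below builds an index body with two vector fields that share the HNSW settings quoted above. The second field name, `description_embedding`, comes from the issue description; the helper names `knn_field` and `index_body` are hypothetical, and applying identical settings to both fields is an assumption rather than something the reporter stated.

```python
import json


def knn_field(dimension=384, engine="nmslib"):
    """Return a knn_vector field mapping with the HNSW settings quoted above."""
    return {
        "type": "knn_vector",
        "dimension": dimension,
        "method": {
            "name": "hnsw",
            "space_type": "l2",
            "engine": engine,
            "parameters": {"ef_construction": 128, "m": 24},
        },
    }


def index_body(fields=("title_embedding", "description_embedding"), engine="nmslib"):
    """Build a create-index body with one knn_vector mapping per field name."""
    return {
        # k-NN has to be enabled on the index for knn_vector fields to be searchable.
        "settings": {"index.knn": True},
        "mappings": {
            "properties": {name: knn_field(engine=engine) for name in fields}
        },
    }


if __name__ == "__main__":
    # Swap engine="lucene" to build the variant where the error was not seen.
    print(json.dumps(index_body(), indent=2))
```

Passing `engine="lucene"` gives the configuration that reportedly avoids the error, which makes it easy to compare the two engines side by side.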
Explain logic is not really supported in either neural-search or knn (which does the work under the hood). In neural-search the explain functionality is not implemented, and knn has a mock implementation that returns a constant KNNWeight. Most probably the error you're seeing is a result of those mock results bubbling up to a higher-level query like bool. While we should investigate the error, most likely explain will not be fixed in the near future.
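One detail worth noting: the docID in the error message, 2147483647, is Integer.MAX_VALUE, which is the value Lucene uses for `DocIdSetIterator.NO_MORE_DOCS`, the sentinel for an exhausted iterator. The toy model below only illustrates how asking a top-k scorer for a score after it has run out of documents could yield a message of this shape. `ToyTopKScorer` and `try_score` are hypothetical names, this is not the knn plugin's code, and whether this is the real failure path is an assumption that nobody in this thread has confirmed.

```python
# Integer.MAX_VALUE; Lucene uses this value as DocIdSetIterator.NO_MORE_DOCS.
NO_MORE_DOCS = 2**31 - 1


class ToyTopKScorer:
    """Hypothetical stand-in for a scorer that only knows its top-k documents."""

    def __init__(self, top_k_scores):
        self._scores = dict(top_k_scores)    # doc ID -> score for the top-k hits
        self._doc_ids = sorted(self._scores)
        self._current = -1                   # not positioned on any doc yet

    def advance(self, target):
        """Move to the first known doc ID >= target, or to NO_MORE_DOCS."""
        self._current = next(
            (doc_id for doc_id in self._doc_ids if doc_id >= target), NO_MORE_DOCS
        )
        return self._current

    def score(self):
        """Return the score for the current doc, failing if there is none."""
        score = self._scores.get(self._current)
        if score is None:
            raise RuntimeError(f"Null score for the docID: {self._current}")
        return score


def try_score(scorer, doc_id):
    """Position the scorer at doc_id and return its score, or the error text."""
    scorer.advance(doc_id)
    try:
        return scorer.score()
    except RuntimeError as err:
        return str(err)


title_scorer = ToyTopKScorer({3: 0.42870614, 7: 0.38892817})
print(try_score(title_scorer, 3))  # doc 3 is in this field's top-k
print(try_score(title_scorer, 8))  # nothing at or after doc 8: iterator exhausted
```

In this toy, a caller that checks the return value of `advance` against `NO_MORE_DOCS` before calling `score` never hits the error, which is one way the non-explain path could differ from the explain path.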
Thanks @martin-gaievski! I'm OK if there is no detailed explain logic. My bug is just about the error being thrown when using the For example, here is what's shown for a successful query (with
{
"hits": {
"max_score": 0.81763434,
"hits": [
{
<trimmed>
"_score": 0.81763434,
"_source": {},
"_explanation": {
"value": 0.81763434,
"description": "sum of:",
"details": [
{
"value": 0.42870614,
"description": "script score function, computed with script:\"Script{type=inline, lang='painless', idOrCode='_score * 1', options={}, params={}}\"",
"details": [
{
"value": 1,
"description": "_score: ",
"details": [
{
"value": 1,
"description": "No Explanation",
"details": []
}
]
}
]
},
{
"value": 0.38892817,
"description": "script score function, computed with script:\"Script{type=inline, lang='painless', idOrCode='_score * 1', options={}, params={}}\"",
"details": [
{
"value": 1,
"description": "_score: ",
"details": [
{
"value": 1,
"description": "No Explanation",
"details": []
}
]
}
]
}
]
}
}
]
}
}
I think you're referring to the details showing a constant So in this example, that works! The important field is But that revealed the bug: referring to The reason I think it's a bug is because it only throws an error when So overall, without At least I think that is correct? If there's a different way of running multiple neural queries on multiple fields and getting the For now, I'm OK using the Lucene engine, where the bug doesn't occur. (In Lucene, if a doc appears in the top-
Transferring to knn, as it seems it's a knn issue.
@vamshin could you help add an assignee?
What is the bug?
When:
- script_score neural queries on multiple (different) vector fields, like in this comment
- script references _score
- explain=true
Then, if a document is returned by some neural field queries (within the sub-query's top-k) but not some others, the query fails with a script runtime exception and the error: Null score for the docID: 2147483647
(At least I think this is why... I'm new to OpenSearch and neural search, so apologies - my explanation for why this happens is just my best guess!)
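The conditions above can be sketched as a request body. This is a minimal reconstruction, not the reporter's exact query: the bool/should wrapper is an assumption (it matches the "sum of:" explanation and the `_score * 1` script shown in the comments), the helper names are hypothetical, `<model-id>` is a placeholder, and `explain` is set in the body here rather than as a URL parameter.

```python
import json


def neural_script_score(field, query_text, model_id, k):
    """Wrap one neural query in a script_score whose script references _score."""
    return {
        "script_score": {
            "query": {
                "neural": {
                    field: {"query_text": query_text, "model_id": model_id, "k": k}
                }
            },
            "script": {"source": "_score * 1"},
        }
    }


def build_search_body(query_text, model_id,
                      fields=("title_embedding", "description_embedding"),
                      k=5, size=100, explain=True):
    """Combine one script_score neural clause per vector field under bool/should."""
    return {
        # A high size with a low k makes it more likely that some hit falls
        # inside one field's top-k but outside the other's.
        "size": size,
        "explain": explain,
        "query": {
            "bool": {
                "should": [
                    neural_script_score(field, query_text, model_id, k)
                    for field in fields
                ]
            }
        },
    }


if __name__ == "__main__":
    # "<model-id>" is a placeholder; use the ID of a deployed text embedding model.
    print(json.dumps(build_search_body("example query text", "<model-id>"), indent=2))
```

Calling `build_search_body(..., explain=False)` gives the variant that is reported to succeed, so the two bodies differ in exactly one field.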
How can one reproduce the bug?
title_embedding and description_embedding.
See an error like:
Note the high size and low k. You might need to adjust the query_text or k to find a combination where a document is returned in one neural query's top k and not the other.
Remove explain=true from the query and notice it succeeds.
What is the expected behavior?
_score for the affected field is 0 or the affected field is excluded entirely - either way, the _explanation should accurately reflect this.
What is your host/environment?
OpenSearch 2.7, Ubuntu 22.04.
Do you have any additional context?
I'm not sure why it only happens with explain=true. (I can't explain it)
It also only happens if using script_score. If using multiple neural queries directly, there is no error. But then there is no score per-field in _explanation - the total is correct, but each field score value is reported as 1. #875 describes this problem.
My use case is: I'd like to try using the similarity scores of each field as features in a Learning to Rank model, which means I need to get each score individually.
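For the Learning to Rank use case, the per-clause scores can be read out of an `_explanation` shaped like the successful response shown in the comments. This is a sketch under stated assumptions: `script_score_values` is a hypothetical helper, the sample hit is abbreviated from that response, and the explanation does not name the field behind each clause. Mapping a value back to a field by position is therefore a guess, and it becomes ambiguous if a non-matching clause is simply left out of the details.

```python
def script_score_values(hit):
    """Return the values of the script_score entries in a hit's _explanation."""
    details = hit.get("_explanation", {}).get("details", [])
    return [
        detail["value"]
        for detail in details
        if detail.get("description", "").startswith("script score function")
    ]


# Abbreviated from the successful response above; descriptions are shortened.
sample_hit = {
    "_score": 0.81763434,
    "_explanation": {
        "value": 0.81763434,
        "description": "sum of:",
        "details": [
            {"value": 0.42870614,
             "description": "script score function, computed with script:...",
             "details": []},
            {"value": 0.38892817,
             "description": "script score function, computed with script:...",
             "details": []},
        ],
    },
}

features = script_score_values(sample_hit)
print(features)
# The clause values should add up to the hit's total score, up to float rounding.
print(abs(sum(features) - sample_hit["_score"]) < 1e-6)
```

A hit that has no `_explanation` yields an empty list, so the helper can be applied to responses fetched without explain as well.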