fix: [AG-178] AI Gateway bugs, 3.9.0 rollup #13932
base: master
Conversation
      ai_plugin_o11y.metrics_set("llm_completion_tokens_count", t.usage.completion_tokens)
    end
end
This is still needed when normalize-json-response is not enabled on a request (for example, when using ai-semantic-cache without ai-proxy).
@oowl What do you recommend?
Just duplicate the code for now ("QAD"), guarded with an `if not (namespace-ai-proxy) then run_the_code end`?
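A minimal sketch of that quick-and-dirty guard, assuming a plugin-context table and an injected `metrics_set` function; the names `ctx`, `namespace-ai-proxy`, and `maybe_set_metrics` are illustrative stand-ins, not Kong's actual API:

```lua
-- Hypothetical guard: only run the token-count extraction here when the
-- ai-proxy namespace has not already handled it via normalize-json-response.
local function maybe_set_metrics(ctx, usage, metrics_set)
  if not ctx["namespace-ai-proxy"] then
    metrics_set("llm_completion_tokens_count", usage.completion_tokens)
    return true   -- extraction ran in this filter
  end
  return false    -- ai-proxy's own filter is expected to handle it
end
```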
Because actually, metadata extraction itself should be moved to its own filter too...
It's fine to run this code twice; the later run will just update/overwrite the previous one, the same as we do for headers and body.
> metadata extraction itself should be moved to its own filter too...
Ideally yes, but we are not in the real filter pipeline yet; we are still in Kong's plugin iterator.
For now, moving this into a utility function in llm/shared that is called by both filters will be fine. Let's do that after 3.9.
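The suggested refactor, sketched as a shared helper that both filters could call; the module path (llm/shared), function name, and metric names other than the one quoted above are assumptions for illustration. A repeated call simply overwrites the metrics with the same values, matching the "run twice is fine" observation:

```lua
-- Hypothetical shared helper (e.g. somewhere under kong/llm/shared):
-- both parse-json-response and normalize-json-response could call this,
-- and a second invocation just overwrites the same metric values.
local shared = {}

function shared.extract_usage_metrics(response, metrics_set)
  local usage = response and response.usage
  if not usage then
    return false  -- nothing to extract (no usage block in the response)
  end
  if usage.prompt_tokens then
    metrics_set("llm_prompt_tokens_count", usage.prompt_tokens)
  end
  if usage.completion_tokens then
    metrics_set("llm_completion_tokens_count", usage.completion_tokens)
  end
  return true
end
```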
So ideally:
For ai-proxy:
parse-json-response | get metadata | normalize-json-response | get metadata
For ai-semantic-cache etc., without ai-proxy:
parse-json-response | get metadata
Even if we do this, in the real world the plugin iterator still executes all "filters" of one plugin before moving on to the next, so with ai-proxy + ai-semantic-cache it would actually be:
parse-json-response | get metadata | parse-json-response (skipped) | get metadata | normalize-json-response | get metadata
So I think it's fine to just keep the "get metadata" step as part of parse and normalize, so we would have:
parse-json-response | parse-json-response (skipped) | normalize-json-response
I'll add it back and just double check the tests!
Oh, you already did it! Okay, I am very confused now.
I've properly fixed the flaky tests.
Summary
AG-178
Bug fix rollup from 3.9.0.RC-1
Checklist
Issue reference
Fixes everything in AG-178.