
fix: [AG-178] AI Gateway bugs, 3.9.0 rollup #13932

Open
wants to merge 10 commits into master from fix/AG-178_ai_bugs_3_9_0_rollup

Conversation

tysoekong (Contributor)

Summary

AG-178

Bug fix rollup from 3.9.0.RC-1

Checklist

  • The Pull Request has tests
  • Doesn't need changelog - unreleased features
  • Doesn't need new docs - bug fixes

Issue reference

Fixes everything in AG-178.

@tysoekong tysoekong changed the title Fix/ag 178 ai bugs 3 9 0 rollup fix: [AG-178] AI Gateway bugs, 3.9.0 rollup Nov 27, 2024
Review thread on kong/llm/drivers/shared.lua (outdated, resolved).
Review thread on kong/llm/plugin/observability.lua (outdated, resolved), on these lines:
    ai_plugin_o11y.metrics_set("llm_completion_tokens_count", t.usage.completion_tokens)
  end
end

Contributor:
This is still needed when normalize-json-response is not enabled on a request (for example, when using ai-semantic-cache without ai-proxy).

tysoekong (Contributor, Author):
@oowl What do you recommend?

Just duplicate the code for now ("QAD"), with an if not (namespace ai-proxy) then run_the_code end guard?
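
For illustration, a minimal sketch of what that quick-and-dirty guard could look like. Everything here is hypothetical except ai_plugin_o11y.metrics_set and the two counter names, which come from the quoted diff; in particular ai_proxy_is_active() is not an existing function:

```lua
-- Hypothetical "QAD" guard: skip the duplicate extraction when ai-proxy's
-- normalize-json-response filter will record the metrics later anyway.
local ai_plugin_o11y = require("kong.llm.plugin.observability")

-- assumption for illustration: detect whether ai-proxy handles this request
local function ai_proxy_is_active()
  return kong and kong.ctx and kong.ctx.shared.ai_proxy_conf ~= nil
end

local function maybe_set_usage_metrics(t)
  -- only run the duplicate extraction when normalize-json-response won't
  if ai_proxy_is_active() then
    return
  end

  if t and t.usage then
    ai_plugin_o11y.metrics_set("llm_prompt_tokens_count", t.usage.prompt_tokens)
    ai_plugin_o11y.metrics_set("llm_completion_tokens_count", t.usage.completion_tokens)
  end
end
```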

tysoekong (Contributor, Author):

Because, actually, metadata extraction itself should be moved to its own filter too...

fffonion (Contributor), Nov 28, 2024:

It's fine to run this code twice; the later run will just update/overwrite the previous one, the same as we do for headers and body.

> metadata extraction itself should be moved to its own filter too...

Ideally yes, but we are not in a real filter pipeline; we are still inside Kong's plugin iterator.
For now, moving this into a utility function in llm/shared that both filters call will be fine. Let's do that after 3.9.
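
As a rough sketch of that interim refactor (module path, function name, and field names are assumptions for illustration; only metrics_set and the two counter names come from the quoted diff), the shared helper could look like this:

```lua
-- e.g. in kong/llm/drivers/shared.lua: one usage-metrics extractor,
-- called from both parse-json-response and normalize-json-response.
local ai_plugin_o11y = require("kong.llm.plugin.observability")

local _M = {}

-- Safe to call twice per request: the later call simply overwrites the
-- values recorded by the earlier one.
function _M.set_usage_metrics(response_table)
  local usage = response_table and response_table.usage
  if not usage then
    return
  end

  ai_plugin_o11y.metrics_set("llm_prompt_tokens_count", usage.prompt_tokens)
  ai_plugin_o11y.metrics_set("llm_completion_tokens_count", usage.completion_tokens)
end

return _M
```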

Contributor:

So ideally:

for ai-proxy:
parse-json-response | get metadata | normalize-json-response | get metadata

for ai-semantic-cache etc., without ai-proxy:
parse-json-response | get metadata

Even if we do this, in the real world the plugin iterator still executes all "filters" for one plugin before executing the next, so if you have ai-proxy + ai-semantic-cache it would actually be:

parse-json-response | get metadata | parse-json-response (skipped) | get metadata | normalize-json-response | get metadata

So I think it's fine to just make "get metadata" part of the parse and normalize filters, so we would have (see the sketch below):

parse-json-response | parse-json-response (skipped) | normalize-json-response
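
A minimal sketch of that last option, folding "get metadata" into the end of each filter's run step. The filter module shape and run() signature are assumed for illustration, and set_usage_metrics is the hypothetical helper sketched earlier, not an existing function in kong/llm/drivers/shared.lua:

```lua
-- Illustrative filter shape: parse the upstream body, do the existing
-- parsing work, then record the usage metrics before returning.
local cjson  = require("cjson.safe")
local shared = require("kong.llm.drivers.shared")  -- assumed to hold set_usage_metrics

local _M = { NAME = "parse-json-response" }

function _M:run(conf)
  local body = kong.service.response.get_raw_body()
  local t = body and cjson.decode(body)

  -- ... existing parse/normalize work on t would go here ...

  -- "get metadata" folded into the filter itself; normalize-json-response
  -- would end with the same call and simply overwrite these values.
  shared.set_usage_metrics(t)

  return true
end

return _M
```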

tysoekong (Contributor, Author):

I'll add it back and just double-check the tests!

tysoekong (Contributor, Author):

Oh, you already did it! Okay, I am very confused now.

@fffonion fffonion force-pushed the fix/AG-178_ai_bugs_3_9_0_rollup branch 2 times, most recently from cc4b6bb to 9bba90f Compare November 27, 2024 05:22
@fffonion fffonion added this to the 3.9.0 milestone Nov 27, 2024
@fffonion fffonion force-pushed the fix/AG-178_ai_bugs_3_9_0_rollup branch from 9bba90f to 78319b9 Compare November 27, 2024 07:35
@fffonion fffonion force-pushed the fix/AG-178_ai_bugs_3_9_0_rollup branch from 3f89c40 to b07d2f9 Compare November 28, 2024 07:45
@tysoekong tysoekong force-pushed the fix/AG-178_ai_bugs_3_9_0_rollup branch 2 times, most recently from d7516b9 to 548a5e4 Compare November 28, 2024 11:23
tysoekong (Contributor, Author):

I've properly fixed the flaky tests.

@tysoekong tysoekong force-pushed the fix/AG-178_ai_bugs_3_9_0_rollup branch 2 times, most recently from caef22a to 477b569 Compare November 28, 2024 12:40
@tysoekong tysoekong force-pushed the fix/AG-178_ai_bugs_3_9_0_rollup branch from 477b569 to 5aacc35 Compare November 28, 2024 13:21
@tysoekong tysoekong force-pushed the fix/AG-178_ai_bugs_3_9_0_rollup branch from 5aacc35 to 8b8f12c Compare November 28, 2024 13:51
tysoekong (Contributor, Author):

@fffonion @oowl Finally... good to go.
