http ratelimit: option to reduce budget on stream done #37548
Check was skipped
This check was not triggered in this CI run
Details
Request (pr/37548/main@011a816)
@mathetake 011a816
#37548 merge
main@602a2b9
http ratelimit: option to reduce budget on stream done
Commit Message: ratelimit: option to execute action on stream done
Additional Description:
This adds a new option, `apply_on_stream_done`, to the `Action` message of the HTTP ratelimit filter. It allows actions to be configured so that they are executed in a response-content-aware way and do not enforce the rate limit (in other words, "fire-and-forget"). Since the addend can currently be controlled via the `envoy.ratelimit.hits_addend` metadata, another filter, for example Lua or Ext Proc, can set that value to reflect the intended cost.

This use case arises from LLM API services, which usually return usage statistics in the response body. More specifically, they have "streaming" APIs whose response is a line-by-line event stream in which the very last line contains the usage statistics. The lazy nature of this action is perfectly fine because, in these use cases, the rate limit takes effect as "you are forbidden from the next time".

Besides the LLM-specific case, I've also encountered this use case in data center resource allocation, where operators want to "block the computation from the next time since you used this much resources in this request".

Risk Level: low
Testing: TODO
Docs Changes: done (via comments in proto)
Release Notes: TODO
Platform Specific Features: n/a
The description might not reflect the actual change, as it is still being discussed and developed; please refer to the diff for now.
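
The "fire-and-forget" behavior described above can be illustrated with a small conceptual sketch. This is not Envoy's implementation or configuration: the `LazyBudget` class, the `usage_from_stream` helper, and the `usage.total_tokens` JSON shape are all hypothetical names chosen for illustration. It only shows the intended ordering, where a request is admitted based on past consumption and the cost taken from the end of the stream is charged afterwards.

```python
import json
from typing import List


class LazyBudget:
    """Toy per-key budget: checked at request start, charged at stream end."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = {}  # key -> units consumed so far

    def allow(self, key: str) -> bool:
        # Enforcement step: only looks at what earlier requests consumed.
        return self.used.get(key, 0) < self.limit

    def charge_on_stream_done(self, key: str, addend: int) -> None:
        # Fire-and-forget step: record the cost; the finished stream is
        # never rejected retroactively.
        self.used[key] = self.used.get(key, 0) + addend


def usage_from_stream(lines: List[str]) -> int:
    """Scan from the end for the first 'data:' event carrying a usage object.

    The {"usage": {"total_tokens": N}} shape is an assumption for this sketch.
    Returns 0 when no such event is found.
    """
    for line in reversed(lines):
        line = line.strip()
        if not line.startswith("data:"):
            continue
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict) and isinstance(event.get("usage"), dict):
            return int(event["usage"].get("total_tokens", 0))
    return 0


budget = LazyBudget(limit=100)
stream = [
    'data: {"delta": "Hel"}',
    'data: {"delta": "lo"}',
    'data: {"usage": {"total_tokens": 120}}',
]

first = budget.allow("user-a")  # admitted: nothing has been charged yet
budget.charge_on_stream_done("user-a", usage_from_stream(stream))
second = budget.allow("user-a")  # the next request sees the exhausted budget
```

In this sketch the first request completes even though it costs more than the whole budget; only the following request is refused, which matches the "you are forbidden from the next time" behavior the description calls acceptable.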
Environment
Request variables
Key | Value |
---|---|
ref | f9e79c7 |
sha | 011a816 |
pr | 37548 |
base-sha | 602a2b9 |
actor | @mathetake |
message | http ratelimit: option to reduce budget on stream done ... |
started | 1733966086.098874 |
target-branch | main |
trusted | false |
Build image
Container image/s (as used in this CI run)
Key | Value |
---|---|
default | envoyproxy/envoy-build-ubuntu:f94a38f62220a2b017878b790b6ea98a0f6c5f9c |
mobile | envoyproxy/envoy-build-ubuntu:mobile-f94a38f62220a2b017878b790b6ea98a0f6c5f9c |
Version
Envoy version (as used in this CI run)
Key | Value |
---|---|
major | 1 |
minor | 33 |
patch | 0 |
dev | true |