stream=true requests cause "Object of type Stream is not JSON serializable" error #117
Comments
Most of the approaches require the full output and multiple calls to the LLM, so we cannot stream the responses to the next layer as they arrive. We could handle it in optillm by waiting for the full stream to finish, but the effect would be similar to using the underlying LLM without streaming.
Also encountered this while integrating OptiLLM. A good middle ground for LLM proxies is to stream what's possible. Every approach has some portions that can be sent back to the client, either for traceability or as additional data, even before the final response. In another proxy (don't want to link it) we called these "intermediate outputs", and they can be toggled on/off based on user preference. However, this specific problem with OptiLLM breaks its compatibility with downstream services, for example Open WebUI, which enables streaming by default. If full streaming support is not planned (understandably, it's a big undertaking), a reasonable workaround is to imitate the streaming interface and simply send the whole response in a single chunk when the workflow is finished (see the sketch below).
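A minimal sketch of that single-chunk workaround, assuming an OpenAI-style chat.completion dict named `full_response` and a Flask-style SSE response on the proxy side (both are assumptions for illustration, not optillm's actual code):

```python
# Imitate the SSE streaming interface: emit the finished response as one
# chat.completion.chunk followed by the [DONE] sentinel.
import json
import time
import uuid

def fake_stream(full_response: dict):
    chunk = {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": full_response.get("model", "unknown"),
        "choices": [{
            "index": 0,
            "delta": {
                "role": "assistant",
                "content": full_response["choices"][0]["message"]["content"],
            },
            "finish_reason": "stop",
        }],
    }
    yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"

# With a Flask endpoint (an assumption about the server framework):
# return Response(fake_stream(full_response), mimetype="text/event-stream")
```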
This is already done; the request here is to enable streaming inputs from the inference server. I can add a similar workaround for the inputs from the inference server as well, so as not to break anything. Good suggestion.
Yes, found the commit now. Likely the streaming workaround is not fully compatible with Open WebUI for some reason. Thank you!
When sending a stream=true request to optiLLM, the service encounters the following error:

{"error":"Object of type Stream is not JSON serializable"}
This error suggests that optiLLM is not properly handling streamed responses from liteLLM or OpenAI GPT-4o. Instead of processing the stream incrementally, it attempts to serialize the raw stream object directly into JSON, which causes the serialization failure.
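For context, here is a hedged illustration of why the serialization fails and what incremental handling looks like, assuming the official OpenAI Python client; this is not optiLLM's actual code path:

```python
import json
from openai import OpenAI

client = OpenAI()  # base_url/api_key could equally point at a liteLLM proxy

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)

# json.dumps(stream)  # TypeError: Object of type Stream is not JSON serializable

# Incremental handling: iterate the chunks and assemble the content.
full_text = ""
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        full_text += delta
print(full_text)
```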
Steps to Reproduce:
Send a POST request to optiLLM with the following payload:
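The payload from the original report was not preserved here; a representative request that triggers the error might look like the following, where the endpoint, port, and model name are assumptions:

```python
import requests  # illustrative client; the reporter's actual client is unknown

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",  # assumed optiLLM address
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
    },
)
print(resp.status_code, resp.text)
```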
Observe the error response:
{"error":"Object of type Stream is not JSON serializable"}
Expected Behavior:
The optiLLM service should handle streamed responses by:
- consuming the stream from the underlying LLM incrementally instead of attempting to serialize the raw stream object, and
- forwarding chunks to the client as they arrive, or at minimum returning the fully assembled response once the stream completes (see the sketch below).
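A minimal sketch of the pass-through behavior described above, assuming a Flask-style response object and an OpenAI-client stream; both are assumptions, not the project's existing implementation:

```python
# Relay each upstream chunk to the client as it arrives instead of
# serializing the Stream object itself.
import json

def relay(stream):
    for chunk in stream:
        # model_dump() converts the pydantic chunk object back to a plain dict
        yield f"data: {json.dumps(chunk.model_dump())}\n\n"
    yield "data: [DONE]\n\n"

# return Response(relay(stream), mimetype="text/event-stream")
```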
Additional Context:
Severity:
High - This issue blocks the usage of stream=true functionality, which is critical for incremental responses in real-time applications.