
stream=true requests cause "Object of type Stream is not JSON serializable" error #117

Open
doublefx opened this issue Dec 28, 2024 · 4 comments
Labels: bug (Something isn't working)

Comments


doublefx commented Dec 28, 2024

When sending a stream=true request to optiLLM, the service encounters the following error:

{"error":"Object of type Stream is not JSON serializable"}

This error suggests that optiLLM is not properly handling streamed responses from liteLLM or OpenAI GPT-4o. Instead of processing the stream incrementally, it attempts to serialize the raw stream object directly into JSON, which causes the serialization failure.
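For illustration, here is a minimal stand-in (not optiLLM's actual code) that reproduces the exact error message: `json` can only serialize plain data structures, so handing it the raw SDK stream object raises the reported TypeError.

```python
import json

class Stream:  # stand-in for the SDK's Stream object returned when stream=True
    def __iter__(self):
        yield {"choices": [{"delta": {"content": "Hello"}}]}

try:
    json.dumps(Stream())  # serializing the raw stream object directly
except TypeError as e:
    print(e)  # Object of type Stream is not JSON serializable
```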

Steps to Reproduce:
Send a POST request to optiLLM with the following payload:

{
    "model": "gpt-4o",
    "messages": [
        { "role": "system", "content": "You are a helpful assistant." },
        { "role": "user", "content": "Write a Python program to build an RL model using only numpy." }
    ],
    "max_tokens": 1000,
    "stream": true
}

Observe the error response:
{"error":"Object of type Stream is not JSON serializable"}

Expected Behavior:
The optiLLM service should handle streamed responses by:

  • Iterating through the stream of chunks from liteLLM or OpenAI.
  • Processing each chunk incrementally and forwarding it to the next layer (e.g., AnythingLLM), roughly as sketched below.
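A rough sketch of that expected handling, assuming a Flask-style proxy sitting in front of the OpenAI SDK (the route and structure below are illustrative, not optiLLM's actual code):

```python
from flask import Flask, Response, request
from openai import OpenAI

app = Flask(__name__)
client = OpenAI()  # in optiLLM's setup this would point at liteLLM

@app.route("/v1/chat/completions", methods=["POST"])
def chat_completions():
    body = request.get_json()
    if body.get("stream"):
        upstream = client.chat.completions.create(**body)

        def relay():
            # Forward each upstream chunk as an SSE event instead of
            # trying to JSON-serialize the Stream object itself.
            for chunk in upstream:
                yield f"data: {chunk.model_dump_json()}\n\n"
            yield "data: [DONE]\n\n"

        return Response(relay(), mimetype="text/event-stream")

    # Non-streaming path: the completed response object is JSON-serializable.
    completion = client.chat.completions.create(**body)
    return Response(completion.model_dump_json(), mimetype="application/json")
```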

Additional Context:

  • The following pipeline works properly: User -> AnythingLLM -> liteLLM -> openai/gpt-4o
  • The problem seems specific to how optiLLM handles the streamed response from liteLLM.

Severity:
High - This issue blocks the usage of stream=true functionality, which is critical for incremental responses in real-time applications.

doublefx changed the title from "… error in optimLLM" to "… error in optiLLM" on Dec 28, 2024
doublefx changed the title from "… error in optiLLM" to "… error" on Dec 28, 2024
codelion (Owner) commented

Most of the approaches require the full output and multiple calls to the LLM, so we cannot stream the responses to the next layer as they come in. We could handle it in optillm by waiting for the full stream to finish, but the effect would be similar to using the underlying LLM without streaming.
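As a sketch of what "waiting for the full stream to finish" would look like on the input side (a hypothetical helper, not optiLLM's actual code), the upstream stream can simply be drained into one complete string before the optimization approaches run:

```python
from openai import OpenAI

client = OpenAI()  # in optiLLM's case this would point at liteLLM / the inference server

def complete_blocking(request_body: dict) -> str:
    """Consume a streamed upstream response and return the full text at once."""
    stream = client.chat.completions.create(**{**request_body, "stream": True})
    parts = []
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)
```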


av commented Dec 29, 2024

Also encountered this while integrating OptiLLM.

A good middle ground for LLM proxies is to stream what's possible. Every approach will have some portions that can be sent back to the client, either for traceability or as additional data, even before the final response. In another proxy (don't want to link it) we called that "Intermediate outputs", and it can be toggled on/off based on user preference.

However, this specific problem with OptiLLM breaks its compatibility with downstream services, for example Open WebUI, which enables streaming by default. If full streaming support is not planned (understandably, it's a big undertaking), a reasonable workaround is to imitate the streaming interface and simply send the whole response in a single chunk when the workflow is finished.


codelion commented Dec 29, 2024

> However, this specific problem with OptiLLM breaks its compatibility with downstream services, for example Open WebUI, which enables streaming by default. If full streaming support is not planned (understandably, it's a big undertaking), a reasonable workaround is to imitate the streaming interface and simply send the whole response in a single chunk when the workflow is finished.

This is already done; the request here is to enable streaming of inputs from the inference server. I can add a similar workaround for the inputs from the inference server as well, so as not to break anything. Good suggestion.
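For context, a sketch of that "imitate streaming" output workaround (illustrative only, not optiLLM's exact implementation): once the full workflow has finished, its final text is wrapped in a single OpenAI-style SSE chunk followed by [DONE], so clients such as Open WebUI that expect a stream keep working.

```python
import json
import time
import uuid

def fake_stream(final_text: str, model: str):
    """Yield the finished response as one chat.completion.chunk SSE event."""
    chunk = {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {"index": 0, "delta": {"content": final_text}, "finish_reason": "stop"}
        ],
    }
    yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"
```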

codelion added the bug (Something isn't working) label on Dec 29, 2024

av commented Dec 29, 2024

Yes, I found the commit now; the existing streaming workaround is likely not fully compatible with Open WebUI for some reason. Thank you!
