Tool calling UI #373

Open
marchellodev opened this issue Dec 17, 2024 · 6 comments

@marchellodev

When using Open WebUI's native search feature, we can see this UI indicating that the model is using a tool:

[image: UI showing the model using a tool]

After it's done, we can see the intermediate results:

[image: intermediate results shown after the tool finishes]

It would be great if we could present a similar UI to the user through pipelines.

@marchellodev
Author

@PlebeiusGaragicus

PlebeiusGaragicus commented Dec 18, 2024

This has been discussed before. I would LOVE this feature - for pipelines to be able to use __event_emitter__.

I'm not sure why the suggestion to use a function includes an example in which a pipe() function is defined... I still don't fully understand the distinction between functions and pipes.
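
For context, this is roughly what the __event_emitter__ pattern looks like inside an Open WebUI function today - a minimal sketch from memory, so treat the exact signature and field names as an assumption - and it's exactly what I'd like pipelines to be able to do:

    # Sketch of Open WebUI's in-function emitter usage (assumed shape, not verified here).
    # __event_emitter__ is injected by Open WebUI when the function runs.
    async def pipe(self, body: dict, __event_emitter__=None) -> str:
        if __event_emitter__:
            await __event_emitter__(
                {"type": "status", "data": {"description": "Calling the tool...", "done": False}}
            )

        result = "final answer from the tool"  # stand-in for the real work

        if __event_emitter__:
            await __event_emitter__(
                {"type": "status", "data": {"description": "Done", "done": True}}
            )
        return result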

@ezavesky

ezavesky commented Dec 23, 2024

+1 that this is a cool feature -- perhaps there are two requests here? One for final output and context, and another for intermediate results from long-running processes:

  • final output looks like passing some other response via the output payload from a pipeline, as @marchellodev mentioned (a minimal sketch of this half follows below).
  • intermediate results look like the emitter framework within OpenWebUI mentioned by @PlebeiusGaragicus, but that will not be trivial: at first glance, the best you can do for message passing is to use the streaming services instead --- though this may be the fastest path to implementation.

Also looks related to #229 in this repo.
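
To make the "final output" half concrete, here is a minimal sketch using the pipelines Pipeline skeleton (the class shape follows the examples in this repo as far as I can tell; the body is purely illustrative):

    from typing import Generator, Iterator, List, Union

    class Pipeline:
        def pipe(
            self, user_message: str, model_id: str, messages: List[dict], body: dict
        ) -> Union[str, Generator, Iterator]:
            # Whatever we return (or yield) here becomes the assistant reply in
            # Open WebUI, so the "final output" half is already covered; what is
            # missing is a separate channel for intermediate status/tool events.
            answer = f"agent result for: {user_message}"  # stand-in for real agent work
            return answer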

@ezavesky

ezavesky commented Jan 5, 2025

Coming back here after more experience with the code, it looks like the __emitter__ flow exists within openwebui's modules only. Specifically, there are no constructs to pass that type of intermediate information over a simple HTTP response or streaming response coming from the fastapi-based routing.

It may be a little unusual, but what about overloading the streamed responses?

  • Since these functions are all emitting strings and markdown, perhaps a special decoration on the output, returned as part of a streaming response, would be good.
  • For example, you'd have yield statements that spit out decorated strings for intermediate updates and/or citation updates.
  • If you're calling this via curl or another utility, you'll see some weird interspersed results, but hopefully you can parse them out and discard them with some simple tag wrapping (thinking XML embedded in the markdown/string).

No demos or code in place yet, but thinking of KISS to avoid disrupting what is working well in other places already.
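
To make the tag-wrapping idea a bit more concrete anyway, here is a rough, untested sketch; the <status> tag and helper names are invented purely for illustration:

    import re

    STATUS_TAG = re.compile(r"<status>(.*?)</status>\s*", re.DOTALL)  # hypothetical tag

    def emit_chunks():
        # What a streamed pipe() might yield: decorated intermediate updates
        # interspersed with the ordinary markdown answer.
        yield "<status>Looking up sources...</status>\n"
        yield "<status>Summarizing results...</status>\n"
        yield "Here is the final answer, streamed as normal markdown."

    def consume(chunks):
        # A curl-style consumer sees the tags interleaved; it can surface them
        # as progress lines and strip them out of the final text.
        final_parts = []
        for chunk in chunks:
            for status in STATUS_TAG.findall(chunk):
                print(f"[progress] {status}")
            final_parts.append(STATUS_TAG.sub("", chunk))
        return "".join(final_parts)

    print(consume(emit_chunks()))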

@PlebeiusGaragicus

I am thinking of exactly this for my own project - but it feels like a last resort and I'm trying to avoid it. Maybe it will work out, though; it just seems hacky.

I really like the idea of an agent returning status/steps as well as output. It makes for a much improved UI experience.

I very much love using my instance of Open WebUI - but for use cases that involve "agentic" solutions it's not there yet. It still seems really focused on LLMs, which I feel will soon be "two generations" behind. It's agents... and then "flows" that I've been obsessing over lately - à la windsurf.ai-style AI solutions, not just LLMs using tools.

I've been working on some code - not ready yet - that does what's proposed above.

    # `graph` here is a LangChain/LangGraph runnable; relay its v2 events as SSE
    async for event in graph.astream_events(input=input_data, version="v2"):
        kind = event["event"]
        if kind == "on_chat_model_stream":
            content = event["data"]["chunk"].content
            if content:
                # print("Sending:", content)
                # Replace newlines with encoded form for SSE
                content_encoded = content.replace('\n', '\\n')
                yield f"data: {content_encoded}\n\n"

Doing it this way very much improves performance and actually gets the tokens to stream, instead of arriving in large chunks. But I'm already not a fan of how it interferes with newlines and requires a very particular format.

This is on my reading list before I add to the above: https://python.langchain.com/v0.1/docs/expression_language/streaming/#event-reference
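
From a quick skim, that event reference also lists tool events (on_tool_start / on_tool_end), which would map nicely onto the status idea above. An untested sketch of how they might slot into the same loop:

    async for event in graph.astream_events(input=input_data, version="v2"):
        kind = event["event"]
        if kind == "on_chat_model_stream":
            # ...token streaming exactly as in the snippet above...
            pass
        elif kind == "on_tool_start":
            # Surface tool usage as an intermediate status update; the exact
            # decoration/payload format is still up for discussion in this thread.
            yield f"data: [status] calling tool: {event['name']}\n\n"
        elif kind == "on_tool_end":
            yield f"data: [status] tool finished: {event['name']}\n\n"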

@ezavesky

ezavesky commented Jan 6, 2025

I like the detailed reference you found in langchain. Digging into event emitters more, the change required may be a little easier than we think.

  • openwebui appears to use streamed events with type = status to pass messages internally
  • in fact, part of the current emitter will log status messages in the chat history to be pulled up later (a little curious, since we treat them as more ephemeral than that, but no matter)
  • so maybe a modified API response endpoint (as you're proposing) can reuse a data pattern that already exists within openwebui?

What might that look like? It might mean emitting an entirely new data payload with "type": "status" (as in the middleware payload), as opposed to reusing the existing delta/streaming format from the OpenAI reference or the new data types that langchain emits (though your suggested reading does appear to be far more flexible).

To avoid pushing changes to both pipelines and openwebui (e.g. with custom newline reformatting -- albeit a clever + simple solution!), perhaps we could experiment with padding the existing response to see whether it is already handled by the above; a rough sketch of what that padding might look like is below. Disclaimer: I'm still learning SSE and streamed responses, so I don't have the full vision of this integration.
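
For illustration, the padding could interleave a status payload (shaped like the emitter's internal events) with ordinary OpenAI-style delta chunks in the SSE stream. Purely speculative, since I haven't checked whether openwebui will pick these up from an external pipeline stream:

    import json

    async def stream_with_status():
        # Speculative: a status payload shaped like openwebui's internal emitter events.
        status = {"type": "status", "data": {"description": "Running tool...", "done": False}}
        yield f"data: {json.dumps(status)}\n\n"

        # Ordinary OpenAI-style streaming chunk carrying the actual content.
        delta = {"choices": [{"delta": {"content": "partial answer "}}]}
        yield f"data: {json.dumps(delta)}\n\n"

        done = {"type": "status", "data": {"description": "Tool finished", "done": True}}
        yield f"data: {json.dumps(done)}\n\n"
        yield "data: [DONE]\n\n"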
