Reduce size of swarms : Tick token using torch #705
Comments
@jmikedupont2 this will be difficult because we need to count the tokens the agents use.
@jmikedupont2 one solution I can think of: set up a tiktoken API that counts the tokens in each request and returns the count. But this will take some time. How much is it adding to the server?
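A minimal sketch of what such a token-counting endpoint could look like, so the main API server no longer has to import the tokenizer itself. The handler name `handle_count_request`, the JSON shape (`{"text": ...}` in, `{"tokens": n}` out), and the `cl100k_base` encoding are all assumptions for illustration, not anything taken from the swarms codebase. The tokenizer is imported lazily and can be swapped out, so only the counting service would need tiktoken installed.

```python
import json
from typing import Callable, Optional, Sequence

# An encoder maps text to a sequence of token ids.
Encoder = Callable[[str], Sequence[int]]


def default_encoder() -> Encoder:
    """Load tiktoken lazily so only the counting service needs it installed."""
    import tiktoken  # assumption: installed only where this service runs

    # The encoding name is an assumption; pick the one matching the model in use.
    return tiktoken.get_encoding("cl100k_base").encode


def count_tokens(text: str, encode: Optional[Encoder] = None) -> int:
    """Return the number of tokens in `text` using the given (or default) encoder."""
    encode = encode or default_encoder()
    return len(encode(text))


def handle_count_request(body: bytes, encode: Optional[Encoder] = None) -> bytes:
    """Hypothetical handler: takes {"text": ...} as JSON, returns {"tokens": n} as JSON."""
    payload = json.loads(body)
    return json.dumps({"tokens": count_tokens(payload["text"], encode)}).encode()
```

Any web framework could route a POST body into `handle_count_request`; passing a custom `encode` makes it possible to test the handler without the tokenizer installed.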
@jmikedupont2 I'm removing 2 more packages to slim down the size ;)
What if we move the token count to a separate litellm proxy server?
That could work well. Would we need an API for it?
A litellm proxy server would run in the same environment as the swarms api server, and would wrap the incoming and outgoing api calls to the swarms api server:

user -> request -> litellm server: supabase(user_usage += len(request)) -> swarms api call -> litellm server: supabase(user_usage += len(answer)) -> answer -> user

The instructions to set up litellm are at: https://docs.litellm.ai/docs/proxy/deploy
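The flow above can be sketched as a small metering wrapper. This is not litellm's API or the swarms API; `metered_call`, the in-memory `usage` dict (standing in for the Supabase `user_usage` column), and the `call_api` callable (standing in for the swarms api call) are hypothetical names used only to show where the two usage updates happen.

```python
from collections import defaultdict
from typing import Callable, Dict

# Stand-in for the Supabase usage table: user id -> accumulated usage.
usage: Dict[str, int] = defaultdict(int)


def metered_call(
    user_id: str,
    request: str,
    call_api: Callable[[str], str],
    count: Callable[[str], int] = len,
) -> str:
    """Wrap one API call and record usage for both the request and the answer.

    `count` defaults to len(), as in the flow above; a token counter
    could be passed instead.
    """
    usage[user_id] += count(request)   # user_usage += len(request)
    answer = call_api(request)         # swarms api call (hypothetical callable)
    usage[user_id] += count(answer)    # user_usage += len(answer)
    return answer
```

Because the counting function is a parameter, the proxy could start with plain `len` and later switch to a tokenizer-based count without changing the wrapper.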
https://github.com/jmikedupont2/openlightllm is my fork of litellm that removes all the non-open-source mess; I would like to continue working on it.
I might have some terraform for that as well somewhere.
We currently seem to have a dependency in the API server on tiktoken that uses torch;
this adds many GB to the Docker image. Can we please remove or refactor that?