
Add support for Llama2, Palm, Cohere, Replicate Models - using litellm #283

Closed
ishaan-jaff wants to merge 4 commits

Conversation

ishaan-jaff

This PR adds support for models from all of the above-mentioned providers using https://github.com/BerriAI/litellm/

Here's a sample of how it's used:

import os
from litellm import completion, acompletion

## set ENV variables
# ENV variables can be set in .env file, too. Example in .env.example
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# llama2 call (hosted on replicate)
model_name = "replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1"
response = completion(model_name, messages)

# cohere call
response = completion("command-nightly", messages)

# anthropic call
response = completion(model="claude-instant-1", messages=messages)

@vercel

vercel bot commented Aug 13, 2023

@ishaan-jaff is attempting to deploy a commit to the Danswer Team on Vercel.

A member of the Team first needs to authorize it.

@ishaan-jaff
Author

@yuhongsun96 @Weves can I get a review on this PR? 😊

Happy to add more docs/tests if this initial commit looks good

@yuhongsun96
Contributor

Hey @ishaan-jaff, this looks amazing! Would absolutely love to have this!

A few questions:

  • I assume this can also do normal (non-chat) completion? It would be good to use litellm consistently throughout.
  • Can this replace the huggingface client? We have a file dedicated to that (see huggingface.py); if this replaces it, it's better to have it unified.
  • This is a fairly lightweight wrapper around the different clients, right? It supports streaming, translating the parameters (like temperature), etc.? How do you translate the parameters, given that different hosting services have different parameter value ranges? (See the sketch after this comment.)

If the above can be addressed, I'd love to just commit to your project completely.

Also, if you could commit some docs for how to set up and use the different model hosting services, please check out our docs page: https://docs.danswer.dev/gen_ai_configs/overview
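
(For reference, a minimal sketch of litellm's streaming and parameter pass-through, assuming its OpenAI-compatible chunk format:)

from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# common params like temperature/max_tokens are passed once;
# litellm maps them to each provider's equivalent
response = completion(
    model="claude-instant-1",
    messages=messages,
    temperature=0.2,
    max_tokens=256,
    stream=True,  # returns an iterator of OpenAI-style chunks
)

for chunk in response:
    # each chunk follows OpenAI's streaming delta schema
    print(chunk["choices"][0]["delta"].get("content", ""), end="")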

@ishaan-jaff
Author

Hi @yuhongsun96

Currently working on ensuring we have support for the HF models you use - will add a new commit once that's done, plus docs

@ishaan-jaff
Author

> If the above can be addressed, I'd love to just commit to your project completely.

Curious why? Has maintaining existing LLM providers been challenging?

@yuhongsun96
Contributor

> If the above can be addressed, I'd love to just commit to your project completely.

> Curious why? Has maintaining existing LLM providers been challenging?

Since your project can handle all these disparate LLM providers, it just seems like an obvious win to unify on it so we don't have to worry about introducing unnecessary code/complexity. And to support newer models, we could just bump your library version down the line. It's not a great use of time on our end to build the same functionality many times over when we can offload that to an external tool.

@yuhongsun96
Contributor

In terms of reviewing the PR, here are some things we'd love to have:

  1. Remove the huggingface client code and replace it with the litellm library's equivalent (can just put it in open_ai.py)
  2. Unify the open_ai.py file to use litellm throughout rather than just for chat
  3. Add docs on https://docs.danswer.dev/gen_ai_configs/overview (https://github.com/danswer-ai/danswer-docs)

We can refactor the code to be generic rather than referencing openai everywhere, no need to worry about that.

Also, a quick question: how do you handle things like tokenizing? We use tiktoken to ensure that we don't pass too large a message to the model, but this is specific to OpenAI.
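
(For reference, a minimal sketch of the kind of tiktoken check being described; the function name and the 3000-token limit are illustrative, not from the codebase:)

import tiktoken

def fits_in_context(text: str, model: str = "gpt-3.5-turbo", limit: int = 3000) -> bool:
    # tiktoken covers OpenAI models; other providers ship their own tokenizers,
    # which is exactly the concern raised above
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text)) <= limit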

ishaan-jaff reopened this Aug 15, 2023
@ishaan-jaff
Author

@yuhongsun96 tested locally on Docker and it's working with the new liteLLM changes 😊

[Screenshot: local Docker deployment working, 2023-08-14]

> docs on https://docs.danswer.dev/gen_ai_configs/overview (https://github.com/danswer-ai/danswer-docs)

As for this, none of the changes in this PR impact your current configuration, so there's nothing to add to the docs.

@ishaan-jaff
Author

Since this PR essentially starts using liteLLM, here's how I was thinking of merging:

  • Use this PR to begin using liteLLM for OpenAI + Hugging Face requests
  • Make a 2nd PR to add all liteLLM-supported models + do any cleanup
  • Add docs on how to use liteLLM-supported models in the 2nd PR

@yuhongsun96 let me know if this sounds good and if the PR looks good

@krrishdholakia
Contributor

bump on this? @yuhongsun96

@yuhongsun96
Contributor

Hey! Sorry for the delay! Quick question: we saw that Langchain similarly has a way to plug-and-play LLMs. What are your views on this?

The thinking is that Langchain is a larger project and is less likely to stop being maintained. But we'd love to hear the case for choosing litellm and its benefits.

@krrishdholakia
Contributor

Nice - what specifically within Langchain did you see? @yuhongsun96

@Weves
Contributor

Weves commented Aug 21, 2023

@ishaan-jaff
Author

ishaan-jaff commented Aug 21, 2023

Thanks for sharing the link

  • How do you set the model there?
  • Don't you still have to translate I/O across multiple different LLMs when you use BaseLLM?

@yuhongsun96
Contributor

Hey @ishaan-jaff, sorry for the really long turnaround on this one. We re-standardized our LLM interfaces:
https://github.com/danswer-ai/danswer/blob/main/backend/danswer/llm/llm.py#L15
https://github.com/danswer-ai/danswer/blob/main/backend/danswer/llm/openai.py#L29

Would LiteLLM still work given this? I imagine it would, right? I still think there's a lot of value in using LiteLLM and would love to hear your thoughts.

@krrishdholakia
Contributor

Hi @yuhongsun96 - yes, you can use our Langchain class if you'd like - ChatLiteLLM: https://python.langchain.com/docs/integrations/chat/litellm
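
(Per the linked docs page, usage looks roughly like this - a minimal sketch:)

from langchain.chat_models import ChatLiteLLM
from langchain.schema import HumanMessage

# any litellm-supported model string works here
chat = ChatLiteLLM(model="gpt-3.5-turbo")
response = chat([HumanMessage(content="Hello, how are you?")])
print(response.content)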

Curious - what made you change your mind? @yuhongsun96 / @Weves

@yuhongsun96
Contributor

We were going to get back to you after reworking our LLM interfaces, but we got really swamped with other things and it slipped. Then someone asked for it again, so we figured we'd reach out and finally get this done.

It looks very easy to integrate just based on that page you sent. What are the env variables for configuring the different models? I remember you saying that was the way to easily swap them?
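
(For reference, the pattern from the snippet at the top of this PR: each provider reads its own key from the environment, and swapping models is then just a change of model string. The Anthropic key name below is an assumption based on litellm's docs:)

import os
from litellm import completion

os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"
os.environ["ANTHROPIC_API_KEY"] = "anthropic key"  # key name assumed from litellm's docs

messages = [{"content": "Hello, how are you?", "role": "user"}]

# swapping providers is only a change of model string
response = completion(model="gpt-3.5-turbo", messages=messages)     # openai
response = completion(model="command-nightly", messages=messages)   # cohere
response = completion(model="claude-instant-1", messages=messages)  # anthropic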

@krrishdholakia
Contributor

Hey @yuhongsun96, sounds good - made the PR here: #510

Let me know if I missed anything.

@yuhongsun96
Contributor

Closing as #510 has been put in and covers this. Thanks for the contribution and your awesome lib!
