
Add support for Llama2, Palm, Cohere, Replicate Models - using litellm #283

Closed
ishaan-jaff wants to merge 4 commits

Conversation

ishaan-jaff

This PR adds support for models from all of the above-mentioned providers using https://github.com/BerriAI/litellm/

Here's a sample of how it's used:

import os
from litellm import completion, acompletion

## set ENV variables
# ENV variables can be set in .env file, too. Example in .env.example
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# llama2 call (hosted on replicate)
model_name = "replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1"
response = completion(model_name, messages)

# cohere call
response = completion("command-nightly", messages)

# anthropic call
response = completion(model="claude-instant-1", messages=messages)

@vercel

vercel bot commented Aug 13, 2023

@ishaan-jaff is attempting to deploy a commit to the Danswer Team on Vercel.

A member of the Team first needs to authorize it.

@ishaan-jaff
Author

@yuhongsun96 @Weves can I get a review on this PR? 😊

Happy to add more docs/tests if this initial commit looks good

@yuhongsun96
Contributor

Hey @ishaan-jaff, this looks amazing! Would absolutely love to have this!

A few questions:

  • I assume this can also do normal (non-chat) completion? It would be good to use litellm consistently throughout.
  • Can this replace the huggingface client? We have a file dedicated to that (see huggingface.py); if this replaces it, it's better to have it unified.
  • This is a fairly lightweight wrapper around the different clients, right? It supports streaming, translating the parameters (like temperature), etc.? How do you translate the parameters, given that different hosting services have different parameter value ranges? (See the sketch after this comment.)

If the above can be addressed, I'd love to just commit to your project completely.

Also, if you could commit some docs for how to set up and use the different model hosting services, please check out our docs page: https://docs.danswer.dev/gen_ai_configs/overview
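
(For reference, a minimal sketch of litellm's streaming and parameter pass-through, assuming its OpenAI-compatible chunk format:)

from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# common params like temperature/max_tokens are passed once;
# litellm maps them to each provider's equivalent
response = completion(
    model="claude-instant-1",
    messages=messages,
    temperature=0.2,
    max_tokens=256,
    stream=True,  # returns an iterator of OpenAI-style chunks
)

for chunk in response:
    # each chunk follows OpenAI's streaming delta schema
    print(chunk["choices"][0]["delta"].get("content", ""), end="")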

@ishaan-jaff
Author

Hi @yuhongsun96

Currently working on ensuring we have support for the HF models you use - will add a new commit once that's done, plus docs

@ishaan-jaff
Author

> If the above can be addressed, I'd love to just commit to your project completely.

Curious why? Has maintaining existing LLM providers been challenging?

@yuhongsun96
Contributor

> If the above can be addressed, I'd love to just commit to your project completely.

> Curious why? Has maintaining existing LLM providers been challenging?

Since your project can handle all these disparate LLM providers, it just seems like an obvious win to unify on it so we don't have to worry about introducing unnecessary code/complexity. And to support newer models, we could just bump your library version down the line. It's not a great use of time on our end to build the same functionality many times over when we can offload that to an external tool.

@yuhongsun96
Contributor

In terms of reviewing the PR, here are some things we'd love to have:

  1. Remove the huggingface client code and replace it with the litellm library's equivalent (can just put it in open_ai.py)
  2. Unify the open_ai.py file to use litellm throughout rather than just for chat
  3. Add docs on https://docs.danswer.dev/gen_ai_configs/overview (https://github.com/danswer-ai/danswer-docs)

We can refactor the code to be generic rather than referencing openai everywhere, no need to worry about that.

Also, a quick question: how do you handle things like tokenizing? We use tiktoken to ensure that we don't pass too large a message to the model, but this is specific to OpenAI.
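
(For reference, a minimal sketch of the kind of tiktoken check being described; the function name and the 3000-token limit are illustrative, not from the codebase:)

import tiktoken

def fits_in_context(text: str, model: str = "gpt-3.5-turbo", limit: int = 3000) -> bool:
    # tiktoken covers OpenAI models; other providers ship their own tokenizers,
    # which is exactly the concern raised above
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text)) <= limit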

ishaan-jaff reopened this Aug 15, 2023
@ishaan-jaff
Author

@yuhongsun96 tested locally on Docker and it's working with the new liteLLM changes 😊

[Screenshot: local Docker deployment working, 2023-08-14]

> docs on https://docs.danswer.dev/gen_ai_configs/overview (https://github.com/danswer-ai/danswer-docs)

As for this, none of the changes in this PR impact your current configuration, so there's nothing to add to the docs.

@ishaan-jaff
Author

Since this PR essentially starts using liteLLM, here's how I was thinking of merging:

  • Use this PR to begin using liteLLM for OpenAI + Hugging Face requests
  • Make a 2nd PR to add all liteLLM-supported models + do any cleanup
  • Add docs on how to use liteLLM-supported models in the 2nd PR

@yuhongsun96 let me know if this sounds good and if the PR looks good

@krrishdholakia
Contributor

bump on this? @yuhongsun96

@yuhongsun96
Contributor

Hey! Sorry for the delay! Quick question: we saw that Langchain similarly has a way to plug-and-play LLMs. What are your views on this?

The thinking is that Langchain is a larger project and is less likely to stop being maintained. But we'd love to hear the case for choosing litellm and its benefits.

@krrishdholakia
Contributor

Nice - what specifically within Langchain did you see? @yuhongsun96

@Weves
Contributor

Weves commented Aug 21, 2023

@ishaan-jaff
Author

ishaan-jaff commented Aug 21, 2023

Thanks for sharing the link

  • How do you set the model there?
  • Don't you still have to translate I/O across multiple different LLMs when you use BaseLLM?

@yuhongsun96
Contributor

Hey @ishaan-jaff, sorry for the really long turnaround on this one. We re-standardized our LLM interfaces:
https://github.com/danswer-ai/danswer/blob/main/backend/danswer/llm/llm.py#L15
https://github.com/danswer-ai/danswer/blob/main/backend/danswer/llm/openai.py#L29

Would LiteLLM still work given this? I imagine it would, right? I still think there's a lot of value in using LiteLLM and would love to hear your thoughts.

@krrishdholakia
Contributor

Hi @yuhongsun96 - yes, you can use our Langchain class if you'd like - ChatLiteLLM: https://python.langchain.com/docs/integrations/chat/litellm
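
(Per the linked docs page, usage looks roughly like this - a minimal sketch:)

from langchain.chat_models import ChatLiteLLM
from langchain.schema import HumanMessage

# any litellm-supported model string works here
chat = ChatLiteLLM(model="gpt-3.5-turbo")
response = chat([HumanMessage(content="Hello, how are you?")])
print(response.content)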

Curious - what made you change your mind? @yuhongsun96 / @Weves

@yuhongsun96
Contributor

We were going to get back to you after reworking our LLM interfaces, but we got really swamped with other things and it slipped. Then someone asked for it again, so we figured we'd reach out and finally get this done.

It looks very easy to integrate just based on that page you sent. What are the env variables for configuring the different models? I remember you saying that was the way to easily swap them?
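
(For reference, the pattern from the snippet at the top of this PR: each provider reads its own key from the environment, and swapping models is then just a change of model string. The Anthropic key name below is an assumption based on litellm's docs:)

import os
from litellm import completion

os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"
os.environ["ANTHROPIC_API_KEY"] = "anthropic key"  # key name assumed from litellm's docs

messages = [{"content": "Hello, how are you?", "role": "user"}]

# swapping providers is only a change of model string
response = completion(model="gpt-3.5-turbo", messages=messages)     # openai
response = completion(model="command-nightly", messages=messages)   # cohere
response = completion(model="claude-instant-1", messages=messages)  # anthropic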

@krrishdholakia
Contributor

Hey @yuhongsun96, sounds good - made the PR here: #510

Let me know if I missed anything.

@yuhongsun96
Contributor

Closing as #510 has been put in and covers this. Thanks for the contribution and your awesome lib!
