with_structured_output not working with OpenAI ChatLiteLLM #28176

Open

chenzimin opened this issue Nov 18, 2024 · 1 comment

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

The following code

from langchain_community.chat_models import ChatLiteLLM
from langchain_core.messages import HumanMessage
from pydantic import BaseModel, Field
import os

os.environ["OPENAI_API_KEY"] = "xxx"

class Joke(BaseModel):
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")

model = ChatLiteLLM(model="gpt-4o")
structured_llm = model.with_structured_output(Joke)
structured_llm.invoke("Tell me a joke about cats")

will raise

BadRequestError: litellm.BadRequestError: OpenAIException - Error code: 400 - {'error': {'message': "Invalid value: 'any'. Supported values are: 'none', 'auto', and 'required'.", 'type': 'invalid_request_error', 'param': 'tool_choice', 'code': 'invalid_value'}}
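For reference, the same 400 appears to be reproducible outside LangChain by calling litellm directly with tool_choice="any", which suggests the value is passed through to OpenAI unchanged (OpenAI only accepts 'none', 'auto', 'required', or a named tool). A minimal sketch; the tool schema below is made up purely for illustration:

# Hypothetical repro outside LangChain: send tool_choice="any" straight to OpenAI via litellm
import litellm

joke_tool = {
    "type": "function",
    "function": {
        "name": "Joke",
        "description": "A joke with a setup and a punchline",
        "parameters": {
            "type": "object",
            "properties": {
                "setup": {"type": "string"},
                "punchline": {"type": "string"},
            },
            "required": ["setup", "punchline"],
        },
    },
}

# Expected to raise the same BadRequestError, since OpenAI only supports
# 'none', 'auto', 'required', or a specific named tool for tool_choice.
litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a joke about cats"}],
    tools=[joke_tool],
    tool_choice="any",
)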

Error Message and Stack Trace (if applicable)

/usr/local/lib/python3.10/dist-packages/litellm/llms/OpenAI/openai.py in completion(self, model_response, timeout, optional_params, logging_obj, model, messages, print_verbose, api_key, api_base, acompletion, litellm_params, logger_fn, headers, custom_prompt_dict, client, organization, custom_llm_provider, drop_params)
    789                         headers, response = (
--> 790                             self.make_sync_openai_chat_completion_request(
    791                                 openai_client=openai_client,

/usr/local/lib/python3.10/dist-packages/litellm/llms/OpenAI/openai.py in make_sync_openai_chat_completion_request(self, openai_client, data, timeout)
    650             else:
--> 651                 raise e
    652 

/usr/local/lib/python3.10/dist-packages/litellm/llms/OpenAI/openai.py in make_sync_openai_chat_completion_request(self, openai_client, data, timeout)
    632         try:
--> 633             raw_response = openai_client.chat.completions.with_raw_response.create(
    634                 **data, timeout=timeout

/usr/local/lib/python3.10/dist-packages/openai/_legacy_response.py in wrapped(*args, **kwargs)
    355 
--> 356         return cast(LegacyAPIResponse[R], func(*args, **kwargs))
    357 

/usr/local/lib/python3.10/dist-packages/openai/_utils/_utils.py in wrapper(*args, **kwargs)
    274                 raise TypeError(msg)
--> 275             return func(*args, **kwargs)
    276 

/usr/local/lib/python3.10/dist-packages/openai/resources/chat/completions.py in create(self, messages, model, audio, frequency_penalty, function_call, functions, logit_bias, logprobs, max_completion_tokens, max_tokens, metadata, modalities, n, parallel_tool_calls, prediction, presence_penalty, response_format, seed, service_tier, stop, store, stream, stream_options, temperature, tool_choice, tools, top_logprobs, top_p, user, extra_headers, extra_query, extra_body, timeout)
    828         validate_response_format(response_format)
--> 829         return self._post(
    830             "/chat/completions",

/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in post(self, path, cast_to, body, options, files, stream, stream_cls)
   1277         )
-> 1278         return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
   1279 

/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in request(self, cast_to, options, remaining_retries, stream, stream_cls)
    954 
--> 955         return self._request(
    956             cast_to=cast_to,

/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in _request(self, cast_to, options, retries_taken, stream, stream_cls)
   1058             log.debug("Re-raising status error")
-> 1059             raise self._make_status_error_from_response(err.response) from None
   1060 

BadRequestError: Error code: 400 - {'error': {'message': "Invalid value: 'any'. Supported values are: 'none', 'auto', and 'required'.", 'type': 'invalid_request_error', 'param': 'tool_choice', 'code': 'invalid_value'}}

During handling of the above exception, another exception occurred:

OpenAIError                               Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/litellm/main.py in completion(model, messages, timeout, temperature, top_p, n, stream, stream_options, stop, max_completion_tokens, max_tokens, modalities, prediction, audio, presence_penalty, frequency_penalty, logit_bias, user, response_format, seed, tools, tool_choice, logprobs, top_logprobs, parallel_tool_calls, deployment_id, extra_headers, functions, function_call, base_url, api_version, api_key, model_list, **kwargs)
   1605                 )
-> 1606                 raise e
   1607 

/usr/local/lib/python3.10/dist-packages/litellm/main.py in completion(model, messages, timeout, temperature, top_p, n, stream, stream_options, stop, max_completion_tokens, max_tokens, modalities, prediction, audio, presence_penalty, frequency_penalty, logit_bias, user, response_format, seed, tools, tool_choice, logprobs, top_logprobs, parallel_tool_calls, deployment_id, extra_headers, functions, function_call, base_url, api_version, api_key, model_list, **kwargs)
   1578                 else:
-> 1579                     response = openai_chat_completions.completion(
   1580                         model=model,

/usr/local/lib/python3.10/dist-packages/litellm/llms/OpenAI/openai.py in completion(self, model_response, timeout, optional_params, logging_obj, model, messages, print_verbose, api_key, api_base, acompletion, litellm_params, logger_fn, headers, custom_prompt_dict, client, organization, custom_llm_provider, drop_params)
    863                 error_headers = getattr(error_response, "headers", None)
--> 864             raise OpenAIError(
    865                 status_code=status_code, message=error_text, headers=error_headers

OpenAIError: Error code: 400 - {'error': {'message': "Invalid value: 'any'. Supported values are: 'none', 'auto', and 'required'.", 'type': 'invalid_request_error', 'param': 'tool_choice', 'code': 'invalid_value'}}

During handling of the above exception, another exception occurred:

BadRequestError                           Traceback (most recent call last)
<ipython-input-3-181fee518d8c> in <cell line: 14>()
     12 model = ChatLiteLLM(model="gpt-4o")
     13 structured_llm = model.with_structured_output(Joke)
---> 14 structured_llm.invoke("Tell me a joke about cats")

/usr/local/lib/python3.10/dist-packages/langchain_core/runnables/base.py in invoke(self, input, config, **kwargs)
   3020                 context.run(_set_config_context, config)
   3021                 if i == 0:
-> 3022                     input = context.run(step.invoke, input, config, **kwargs)
   3023                 else:
   3024                     input = context.run(step.invoke, input, config)

/usr/local/lib/python3.10/dist-packages/langchain_core/runnables/base.py in invoke(self, input, config, **kwargs)
   5352         **kwargs: Optional[Any],
   5353     ) -> Output:
-> 5354         return self.bound.invoke(
   5355             input,
   5356             self._merge_configs(config),

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in invoke(self, input, config, stop, **kwargs)
    284         return cast(
    285             ChatGeneration,
--> 286             self.generate_prompt(
    287                 [self._convert_input(input)],
    288                 stop=stop,

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in generate_prompt(self, prompts, stop, callbacks, **kwargs)
    784     ) -> LLMResult:
    785         prompt_messages = [p.to_messages() for p in prompts]
--> 786         return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
    787 
    788     async def agenerate_prompt(

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
    641                 if run_managers:
    642                     run_managers[i].on_llm_error(e, response=LLMResult(generations=[]))
--> 643                 raise e
    644         flattened_outputs = [
    645             LLMResult(generations=[res.generations], llm_output=res.llm_output)  # type: ignore[list-item]

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
    631             try:
    632                 results.append(
--> 633                     self._generate_with_cache(
    634                         m,
    635                         stop=stop,

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in _generate_with_cache(self, messages, stop, run_manager, **kwargs)
    849         else:
    850             if inspect.signature(self._generate).parameters.get("run_manager"):
--> 851                 result = self._generate(
    852                     messages, stop=stop, run_manager=run_manager, **kwargs
    853                 )

/usr/local/lib/python3.10/dist-packages/langchain_community/chat_models/litellm.py in _generate(self, messages, stop, run_manager, stream, **kwargs)
    357         message_dicts, params = self._create_message_dicts(messages, stop)
    358         params = {**params, **kwargs}
--> 359         response = self.completion_with_retry(
    360             messages=message_dicts, run_manager=run_manager, **params
    361         )

/usr/local/lib/python3.10/dist-packages/langchain_community/chat_models/litellm.py in completion_with_retry(self, run_manager, **kwargs)
    290             return self.client.completion(**kwargs)
    291 
--> 292         return _completion_with_retry(**kwargs)
    293 
    294     @pre_init

/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in wrapped_f(*args, **kw)
    334             copy = self.copy()
    335             wrapped_f.statistics = copy.statistics  # type: ignore[attr-defined]
--> 336             return copy(f, *args, **kw)
    337 
    338         def retry_with(*args: t.Any, **kwargs: t.Any) -> WrappedFn:

/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in __call__(self, fn, *args, **kwargs)
    473         retry_state = RetryCallState(retry_object=self, fn=fn, args=args, kwargs=kwargs)
    474         while True:
--> 475             do = self.iter(retry_state=retry_state)
    476             if isinstance(do, DoAttempt):
    477                 try:

/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in iter(self, retry_state)
    374         result = None
    375         for action in self.iter_state.actions:
--> 376             result = action(retry_state)
    377         return result
    378 

/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in <lambda>(rs)
    396     def _post_retry_check_actions(self, retry_state: "RetryCallState") -> None:
    397         if not (self.iter_state.is_explicit_retry or self.iter_state.retry_run_result):
--> 398             self._add_action_func(lambda rs: rs.outcome.result())
    399             return
    400 

/usr/lib/python3.10/concurrent/futures/_base.py in result(self, timeout)
    449                     raise CancelledError()
    450                 elif self._state == FINISHED:
--> 451                     return self.__get_result()
    452 
    453                 self._condition.wait(timeout)

/usr/lib/python3.10/concurrent/futures/_base.py in __get_result(self)
    401         if self._exception:
    402             try:
--> 403                 raise self._exception
    404             finally:
    405                 # Break a reference cycle with the exception in self._exception

/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in __call__(self, fn, *args, **kwargs)
    476             if isinstance(do, DoAttempt):
    477                 try:
--> 478                     result = fn(*args, **kwargs)
    479                 except BaseException:  # noqa: B902
    480                     retry_state.set_exception(sys.exc_info())  # type: ignore[arg-type]

/usr/local/lib/python3.10/dist-packages/langchain_community/chat_models/litellm.py in _completion_with_retry(**kwargs)
    288         @retry_decorator
    289         def _completion_with_retry(**kwargs: Any) -> Any:
--> 290             return self.client.completion(**kwargs)
    291 
    292         return _completion_with_retry(**kwargs)

/usr/local/lib/python3.10/dist-packages/litellm/utils.py in wrapper(*args, **kwargs)
    958                     e, traceback_exception, start_time, end_time
    959                 )  # DO NOT MAKE THREADED - router retry fallback relies on this!
--> 960             raise e
    961 
    962     @wraps(original_function)

/usr/local/lib/python3.10/dist-packages/litellm/utils.py in wrapper(*args, **kwargs)
    847                     print_verbose(f"Error while checking max token limit: {str(e)}")
    848             # MODEL CALL
--> 849             result = original_function(*args, **kwargs)
    850             end_time = datetime.datetime.now()
    851             if "stream" in kwargs and kwargs["stream"] is True:

/usr/local/lib/python3.10/dist-packages/litellm/main.py in completion(model, messages, timeout, temperature, top_p, n, stream, stream_options, stop, max_completion_tokens, max_tokens, modalities, prediction, audio, presence_penalty, frequency_penalty, logit_bias, user, response_format, seed, tools, tool_choice, logprobs, top_logprobs, parallel_tool_calls, deployment_id, extra_headers, functions, function_call, base_url, api_version, api_key, model_list, **kwargs)
   3058     except Exception as e:
   3059         ## Map to OpenAI Exception
-> 3060         raise exception_type(
   3061             model=model,
   3062             custom_llm_provider=custom_llm_provider,

/usr/local/lib/python3.10/dist-packages/litellm/litellm_core_utils/exception_mapping_utils.py in exception_type(model, original_exception, custom_llm_provider, completion_kwargs, extra_kwargs)
   2134         if exception_mapping_worked:
   2135             setattr(e, "litellm_response_headers", litellm_response_headers)
-> 2136             raise e
   2137         else:
   2138             for error_type in litellm.LITELLM_EXCEPTION_TYPES:

/usr/local/lib/python3.10/dist-packages/litellm/litellm_core_utils/exception_mapping_utils.py in exception_type(model, original_exception, custom_llm_provider, completion_kwargs, extra_kwargs)
    280                 ):
    281                     exception_mapping_worked = True
--> 282                     raise BadRequestError(
    283                         message=f"{exception_provider} - {message}",
    284                         llm_provider=custom_llm_provider,

BadRequestError: litellm.BadRequestError: OpenAIException - Error code: 400 - {'error': {'message': "Invalid value: 'any'. Supported values are: 'none', 'auto', and 'required'.", 'type': 'invalid_request_error', 'param': 'tool_choice', 'code': 'invalid_value'}}

Description

I am trying to use with_structured_output with ChatLiteLLM for OpenAI models. However, it throws an exception. I believe the error is coming from this line. My expectation is that it should work and that the model returns structured output.

I have also tried Anthropic models with ChatLiteLLM, and they work.
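
Until this is fixed, one possible workaround is to skip with_structured_output and bind the tool manually with a tool_choice value OpenAI accepts, then parse the tool call back into the Pydantic model. This is only a sketch and assumes ChatLiteLLM.bind_tools forwards tool_choice to the API unchanged:

# Hypothetical workaround: bind the tool with tool_choice="required" (accepted by OpenAI)
# instead of the "any" that with_structured_output ends up sending, and parse manually.
from langchain_community.chat_models import ChatLiteLLM
from langchain_core.output_parsers.openai_tools import PydanticToolsParser
from pydantic import BaseModel, Field

class Joke(BaseModel):
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")

model = ChatLiteLLM(model="gpt-4o")
structured_llm = model.bind_tools([Joke], tool_choice="required") | PydanticToolsParser(
    tools=[Joke], first_tool_only=True
)
print(structured_llm.invoke("Tell me a joke about cats"))  # should print a Joke instance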

System Info

System Information

OS: Linux
OS Version: #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
Python Version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]

Package Information

langchain_core: 0.3.17
langchain: 0.3.7
langchain_community: 0.3.7
langsmith: 0.1.142
langchain_text_splitters: 0.3.2

Optional packages not installed

langgraph
langserve

Other Dependencies

aiohttp: 3.10.10
async-timeout: 4.0.3
dataclasses-json: 0.6.7
httpx: 0.27.2
httpx-sse: 0.4.0
jsonpatch: 1.33
numpy: 1.26.4
orjson: 3.10.11
packaging: 24.2
pydantic: 2.9.2
pydantic-settings: 2.6.1
PyYAML: 6.0.2
requests: 2.32.3
requests-toolbelt: 1.0.0
SQLAlchemy: 2.0.35
tenacity: 9.0.0
typing-extensions: 4.12.2

@ShawnLJW (Contributor)

Similar problem with ChatDeepInfra. It doesn't error, but tools never get called. @keenborder786, I see that you opened a PR for this; could you also look at ChatDeepInfra once the ChatLiteLLM fix is approved? It should be the same fix.
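
For anyone trying to confirm the ChatDeepInfra symptom, a quick check is to bind a tool and inspect whether any tool calls come back. This is a rough sketch that assumes ChatDeepInfra exposes bind_tools like other langchain_community chat models; the model name is only an example:

# Hypothetical check: with the bug described above, tool_calls reportedly comes back empty.
from langchain_community.chat_models import ChatDeepInfra
from pydantic import BaseModel, Field

class Joke(BaseModel):
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")

model = ChatDeepInfra(model="meta-llama/Meta-Llama-3.1-70B-Instruct")  # example model id
msg = model.bind_tools([Joke]).invoke("Tell me a joke about cats")
print(msg.tool_calls)  # expected to contain a Joke call, but reportedly empty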
