[Bug]: Pydantic error breaks MultimodalConversableAgent #383
I followed your steps in a fresh virtual environment using the following commands:

python3.10 -m venv .venv
source .venv/bin/activate
pip install "ag2[lmm]==0.6.1"

After setting up the environment, I ran your code snippet, but I was unable to reproduce the error. Everything executed without any issues on my end.

To help us identify the issue, could you share the output of the following command:

pip list

This will give us a clear view of the versions of pydantic and other dependencies installed, which might be contributing to the behavior you're observing.
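If the full `pip list` output is too noisy, a small Python sketch like the one below can report just the versions of the packages most likely involved. The package names queried here (`ag2`, `pydantic`, `openai`) are assumptions about which dependencies matter for this bug.

```python
# Sketch: report installed versions of a few packages of interest,
# as a focused alternative to pasting the full `pip list` output.
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages):
    """Return {package: version string}, or 'not installed' if absent."""
    out = {}
    for name in packages:
        try:
            out[name] = version(name)
        except PackageNotFoundError:
            out[name] = "not installed"
    return out


if __name__ == "__main__":
    # These package names are illustrative guesses, not a confirmed list.
    for pkg, ver in report_versions(["ag2", "pydantic", "openai"]).items():
        print(f"{pkg}=={ver}")
```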
Somehow I can't reproduce it with ag2==0.6.1, but I am able to reproduce it with ag2==0.7.0:

from autogen import UserProxyAgent, config_list_from_json
from autogen.agentchat.contrib.multimodal_conversable_agent import MultimodalConversableAgent
config_list_4v = config_list_from_json(
"../OAI_CONFIG_LIST",
filter_dict={
"model": ["gpt-4o"],
},
)
image2table_convertor = MultimodalConversableAgent(
name="image2table_convertor",
system_message="""
You are an image to table convertor. You will receive an image of tables. The original table could be in csv, pdf or other format.
You need to do the following steps in sequence:
1. Extract the table content and structure.
2. Make sure the structure is complete.
3. Correct typos in the text fields.
4. In the end, output the table in Markdown.
""",
llm_config={"config_list": config_list_4v, "max_tokens": 300},
human_input_mode="NEVER",
max_consecutive_auto_reply=1,
)
user_proxy = UserProxyAgent(
name="User_proxy",
system_message="A human admin.",
human_input_mode="NEVER", # Try between ALWAYS or NEVER
code_execution_config=False
)
# Ask the question with an image
chat_result = user_proxy.initiate_chat(
image2table_convertor,
message="""Please extract table from the following image and convert it to Markdown.
<img /workspaces/ag2/notebook/agentchat_pdf_rag/parsed_pdf_info/table-52-17.jpg>.""",
)
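The message above embeds the image as an `<img path>` tag inside the prompt text. As an illustration only (this is not ag2's actual parser), the sketch below shows how such tags could be located in a message string:

```python
# Illustrative sketch: locate `<img ...>` tags in a message string.
# The regex here is an assumption for demonstration, not ag2's real
# internal parsing logic for MultimodalConversableAgent.
import re

IMG_TAG = re.compile(r"<img\s+([^>]+)>")


def extract_image_refs(message):
    """Return the path/URL payload of each <img ...> tag in a message."""
    return [m.strip() for m in IMG_TAG.findall(message)]


message = """Please extract table from the following image and convert it to Markdown.
<img /workspaces/ag2/notebook/agentchat_pdf_rag/parsed_pdf_info/table-52-17.jpg>."""
print(extract_image_refs(message))
```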
Hi @AgentGenie, the issue has been resolved in a PR.
Thanks @rjambrecic
Describe the bug
I got a pydantic error with 0.6.1 but not with 0.6.0, which breaks MultimodalConversableAgent.
Steps to reproduce
Model Used
No response
Expected Behavior
No response
Screenshots and logs
Additional Information
No response