diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 9f064605c5..1893c28239 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -12,6 +12,6 @@ ## Checks -- [ ] I've included any doc changes needed for https://ag2ai.github.io/autogen/. See https://ag2ai.github.io/ag2/docs/Contribute#documentation to build and test documentation locally. +- [ ] I've included any doc changes needed for https://ag2ai.github.io/ag2/. See https://ag2ai.github.io/ag2/docs/Contribute#documentation to build and test documentation locally. - [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR. - [ ] I've made sure all auto checks have passed. diff --git a/README.md b/README.md index b5897b728c..396e669d1e 100644 --- a/README.md +++ b/README.md @@ -84,7 +84,7 @@ We adopt the Apache 2.0 license from v0.3. This enhances our commitment to open- ## What is AG2 -AG2 (formally AutoGen) is an open-source programming framework for building AI agents and facilitating cooperation among multiple agents to solve tasks. AG2 aims to streamline the development and research of agentic AI, much like PyTorch does for Deep Learning. It offers features such as agents capable of interacting with each other, facilitates the use of various large language models (LLMs) and tool use support, autonomous and human-in-the-loop workflows, and multi-agent conversation patterns. +AG2 (formerly AutoGen) is an open-source programming framework for building AI agents and facilitating cooperation among multiple agents to solve tasks. AG2 aims to streamline the development and research of agentic AI, much like PyTorch does for Deep Learning. It offers features such as agents that can interact with each other, support for various large language models (LLMs) and tools, autonomous and human-in-the-loop workflows, and multi-agent conversation patterns. 
**Open Source Statement**: The project welcomes contributions from developers and organizations worldwide. Our goal is to foster a collaborative and inclusive community where diverse perspectives and expertise can drive innovation and enhance the project's capabilities. Whether you are an individual contributor or represent an organization, we invite you to join us in shaping the future of this project. Together, we can build something truly remarkable. @@ -335,7 +335,7 @@ Explore detailed implementations with sample code and applications to help you g ## License This project is licensed under the [Apache License, Version 2.0 (Apache-2.0)](./LICENSE). -This project is a spin-off of https://github.com/ag2ai/ag2 and contains code under two licenses: +This project is a spin-off of [AutoGen](https://github.com/microsoft/autogen) and contains code under two licenses: - The original code from https://github.com/microsoft/autogen is licensed under the MIT License. See the [LICENSE_original_MIT](./license_original/LICENSE_original_MIT) file for details. diff --git a/autogen/agentchat/contrib/agent_eval/README.md b/autogen/agentchat/contrib/agent_eval/README.md index cd05199aa1..b9a9815e2d 100644 --- a/autogen/agentchat/contrib/agent_eval/README.md +++ b/autogen/agentchat/contrib/agent_eval/README.md @@ -1,9 +1,7 @@ -Agents for running the [AgentEval](https://ag2ai.github.io/autogen/blog/2023/11/20/AgentEval/) pipeline. +Agents for running the [AgentEval](https://ag2ai.github.io/ag2/blog/2023/11/20/AgentEval/) pipeline. AgentEval is a process for evaluating a LLM-based system's performance on a given task. When given a task to evaluate and a few example runs, the critic and subcritic agents create evaluation criteria for evaluating a system's solution. Once the criteria has been created, the quantifier agent can evaluate subsequent task solutions based on the generated criteria. 
-For more information see: [AgentEval Integration Roadmap](https://github.com/microsoft/autogen/issues/2162) - -See our [blog post](https://ag2ai.github.io/autogen/blog/2024/06/21/AgentEval) for usage examples and general explanations. +See our [blog post](https://ag2ai.github.io/ag2/blog/2024/06/21/AgentEval) for usage examples and general explanations. diff --git a/notebook/JSON_mode_example.ipynb b/notebook/JSON_mode_example.ipynb index 0e8d65d213..3dd6f7510b 100644 --- a/notebook/JSON_mode_example.ipynb +++ b/notebook/JSON_mode_example.ipynb @@ -19,7 +19,7 @@ "\n", "\n", "Please find documentation about this feature in OpenAI [here](https://platform.openai.com/docs/guides/text-generation/json-mode).\n", - "More information about Agent Descriptions is located [here](https://ag2ai.github.io/autogen/blog/2023/12/29/AgentDescriptions/)\n", + "More information about Agent Descriptions is located [here](https://ag2ai.github.io/ag2/blog/2023/12/29/AgentDescriptions/)\n", "\n", "Benefits\n", "- This contribution provides a method to implement precise speaker transitions based on content of the input message. The example can prevent Prompt hacks that use coersive language.\n", diff --git a/notebook/agentchat_MathChat.ipynb b/notebook/agentchat_MathChat.ipynb index 5c6fcc30c8..0bafb6606b 100644 --- a/notebook/agentchat_MathChat.ipynb +++ b/notebook/agentchat_MathChat.ipynb @@ -9,7 +9,7 @@ "\n", "AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation. Please find documentation about this feature [here](https://ag2ai.github.io/ag2/docs/Use-Cases/agent_chat).\n", "\n", - "MathChat is an experimental conversational framework for math problem solving. In this notebook, we demonstrate how to use MathChat to solve math problems. 
MathChat uses the `AssistantAgent` and `MathUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/ag2ai/ag2/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `MathUserProxyAgent` implements a different auto reply mechanism corresponding to the MathChat prompts. You can find more details in the paper [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337) or the [blogpost](https://ag2ai.github.io/autogen/blog/2023/06/28/MathChat).\n", + "MathChat is an experimental conversational framework for math problem solving. In this notebook, we demonstrate how to use MathChat to solve math problems. MathChat uses the `AssistantAgent` and `MathUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/ag2ai/ag2/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `MathUserProxyAgent` implements a different auto reply mechanism corresponding to the MathChat prompts. 
You can find more details in the paper [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337) or the [blogpost](https://ag2ai.github.io/ag2/blog/2023/06/28/MathChat).\n", "\n", "````{=mdx}\n", ":::info Requirements\n", diff --git a/notebook/agentchat_cost_token_tracking.ipynb b/notebook/agentchat_cost_token_tracking.ipynb index 92ee8d1b97..297987b6ab 100644 --- a/notebook/agentchat_cost_token_tracking.ipynb +++ b/notebook/agentchat_cost_token_tracking.ipynb @@ -58,7 +58,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 4, "metadata": {}, "outputs": [], "source": [ @@ -68,7 +68,7 @@ "config_list = autogen.config_list_from_json(\n", " \"OAI_CONFIG_LIST\",\n", " filter_dict={\n", - " \"model\": [\"gpt-3.5-turbo\", \"gpt-3.5-turbo-16k\"], # comment out to get all\n", + " \"tags\": [\"gpt-4o\", \"gpt-4o-mini\"], # comment out to get all\n", " },\n", ")" ] @@ -83,17 +83,17 @@ "```python\n", "config_list = [\n", " {\n", - " \"model\": \"gpt-3.5-turbo\",\n", + " \"model\": \"gpt-4o\",\n", " \"api_key\": \"\",\n", - " \"tags\": [\"gpt-3.5-turbo\"],\n", - " }, # OpenAI API endpoint for gpt-3.5-turbo\n", + " \"tags\": [\"gpt-4o\"],\n", + " }, # OpenAI API endpoint for gpt-4o\n", " {\n", - " \"model\": \"gpt-35-turbo-0613\", # 0613 or newer is needed to use functions\n", + " \"model\": \"gpt-4o-mini\",\n", " \"base_url\": \"\", \n", " \"api_type\": \"azure\", \n", - " \"api_version\": \"2024-02-01\", # 2023-07-01-preview or newer is needed to use functions\n", + " \"api_version\": \"2024-02-01\",\n", " \"api_key\": \"\",\n", - " \"tags\": [\"gpt-3.5-turbo\", \"0613\"],\n", + " \"tags\": [\"gpt-4o-mini\", \"20240718\"],\n", " }\n", "]\n", "```\n", @@ -110,14 +110,14 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "0.00020600000000000002\n" + "0.0011125\n" ] } ], @@ -139,14 +139,14 @@ }, { 
"cell_type": "code", - "execution_count": 7, + "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "Price: 109\n" + "Price: 0.144\n" ] } ], @@ -177,7 +177,7 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 7, "metadata": {}, "outputs": [ { @@ -198,7 +198,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 8, "metadata": {}, "outputs": [ { @@ -207,20 +207,20 @@ "text": [ "----------------------------------------------------------------------------------------------------\n", "Usage summary excluding cached usage: \n", - "Total cost: 0.00023\n", - "* Model 'gpt-35-turbo': cost: 0.00023, prompt_tokens: 25, completion_tokens: 142, total_tokens: 167\n", + "Total cost: 0.154\n", + "* Model 'gpt-4o-2024-08-06': cost: 0.154, prompt_tokens: 25, completion_tokens: 129, total_tokens: 154\n", "\n", "All completions are non-cached: the total cost with cached completions is the same as actual cost.\n", "----------------------------------------------------------------------------------------------------\n", "----------------------------------------------------------------------------------------------------\n", "Usage summary excluding cached usage: \n", - "Total cost: 0.00023\n", - "* Model 'gpt-35-turbo': cost: 0.00023, prompt_tokens: 25, completion_tokens: 142, total_tokens: 167\n", + "Total cost: 0.154\n", + "* Model 'gpt-4o-2024-08-06': cost: 0.154, prompt_tokens: 25, completion_tokens: 129, total_tokens: 154\n", "----------------------------------------------------------------------------------------------------\n", "----------------------------------------------------------------------------------------------------\n", "Usage summary including cached usage: \n", - "Total cost: 0.00023\n", - "* Model 'gpt-35-turbo': cost: 0.00023, prompt_tokens: 25, completion_tokens: 142, total_tokens: 167\n", + "Total cost: 0.154\n", + "* Model 'gpt-4o-2024-08-06': cost: 0.154, 
prompt_tokens: 25, completion_tokens: 129, total_tokens: 154\n", "----------------------------------------------------------------------------------------------------\n" ] } @@ -236,15 +236,15 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "{'total_cost': 0.0002255, 'gpt-35-turbo': {'cost': 0.0002255, 'prompt_tokens': 25, 'completion_tokens': 142, 'total_tokens': 167}}\n", - "{'total_cost': 0.0002255, 'gpt-35-turbo': {'cost': 0.0002255, 'prompt_tokens': 25, 'completion_tokens': 142, 'total_tokens': 167}}\n" + "{'total_cost': 0.154, 'gpt-4o-2024-08-06': {'cost': 0.154, 'prompt_tokens': 25, 'completion_tokens': 129, 'total_tokens': 154}}\n", + "{'total_cost': 0.154, 'gpt-4o-2024-08-06': {'cost': 0.154, 'prompt_tokens': 25, 'completion_tokens': 129, 'total_tokens': 154}}\n" ] } ], @@ -256,7 +256,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 10, "metadata": {}, "outputs": [ { @@ -265,12 +265,12 @@ "text": [ "----------------------------------------------------------------------------------------------------\n", "Usage summary excluding cached usage: \n", - "Total cost: 0.00023\n", - "* Model 'gpt-35-turbo': cost: 0.00023, prompt_tokens: 25, completion_tokens: 142, total_tokens: 167\n", + "Total cost: 0.154\n", + "* Model 'gpt-4o-2024-08-06': cost: 0.154, prompt_tokens: 25, completion_tokens: 129, total_tokens: 154\n", "\n", "Usage summary including cached usage: \n", - "Total cost: 0.00045\n", - "* Model 'gpt-35-turbo': cost: 0.00045, prompt_tokens: 50, completion_tokens: 284, total_tokens: 334\n", + "Total cost: 0.308\n", + "* Model 'gpt-4o-2024-08-06': cost: 0.308, prompt_tokens: 50, completion_tokens: 258, total_tokens: 308\n", "----------------------------------------------------------------------------------------------------\n" ] } @@ -284,7 +284,7 @@ }, { "cell_type": "code", - "execution_count": 7, + 
"execution_count": 11, "metadata": {}, "outputs": [ { @@ -303,7 +303,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 12, "metadata": {}, "outputs": [ { @@ -314,8 +314,8 @@ "No actual cost incurred (all completions are using cache).\n", "\n", "Usage summary including cached usage: \n", - "Total cost: 0.00023\n", - "* Model 'gpt-35-turbo': cost: 0.00023, prompt_tokens: 25, completion_tokens: 142, total_tokens: 167\n", + "Total cost: 0.154\n", + "* Model 'gpt-4o-2024-08-06': cost: 0.154, prompt_tokens: 25, completion_tokens: 129, total_tokens: 154\n", "----------------------------------------------------------------------------------------------------\n" ] } @@ -340,7 +340,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 13, "metadata": {}, "outputs": [ { @@ -354,19 +354,57 @@ "--------------------------------------------------------------------------------\n", "\u001b[33massistant\u001b[0m (to ai_user):\n", "\n", - "To find x, we need to take the cube root of 125. The cube root of a number is the number that, when multiplied by itself three times, gives the original number.\n", + "To solve the equation \\(x^3 = 125\\), you need to find the value of \\(x\\) that makes this equation true. \n", + "\n", + "You can solve for \\(x\\) by taking the cube root of both sides of the equation:\n", + "\n", + "\\[\n", + "x = \\sqrt[3]{125}\n", + "\\]\n", "\n", - "In this case, the cube root of 125 is 5 since 5 * 5 * 5 = 125. Therefore, x = 5.\n", + "Since \\(125\\) is \\(5^3\\), the cube root of \\(125\\) is \\(5\\). Thus,\n", + "\n", + "\\[\n", + "x = 5\n", + "\\]\n", + "\n", + "Therefore, the solution to the equation \\(x^3 = 125\\) is \\(x = 5\\).\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[33mai_user\u001b[0m (to assistant):\n", "\n", - "That's correct! Well done. The value of x is indeed 5, as you correctly found by taking the cube root of 125. 
Keep up the good work!\n", + "Can you help me solve the equation \\(2x^2 - 8x = 0\\)?\n", "\n", "--------------------------------------------------------------------------------\n", "\u001b[33massistant\u001b[0m (to ai_user):\n", "\n", - "Thank you! I'm glad I could help. If you have any more questions, feel free to ask!\n", + "Certainly! To solve the equation \\(2x^2 - 8x = 0\\), you can start by factoring the expression on the left-hand side.\n", + "\n", + "First, factor out the greatest common factor, which is \\(2x\\):\n", + "\n", + "\\[\n", + "2x(x - 4) = 0\n", + "\\]\n", + "\n", + "Now, you have a product of two factors equal to zero. According to the zero product property, if the product of two factors is zero, at least one of the factors must be zero. So, you set each factor equal to zero and solve for \\(x\\):\n", + "\n", + "1. \\(2x = 0\\)\n", + "\n", + " Divide both sides by 2 to solve for \\(x\\):\n", + "\n", + " \\[\n", + " x = 0\n", + " \\]\n", + "\n", + "2. \\(x - 4 = 0\\)\n", + "\n", + " Add 4 to both sides to solve for \\(x\\):\n", + "\n", + " \\[\n", + " x = 4\n", + " \\]\n", + "\n", + "So, the solutions to the equation \\(2x^2 - 8x = 0\\) are \\(x = 0\\) and \\(x = 4\\).\n", "\n", "--------------------------------------------------------------------------------\n" ] @@ -374,10 +412,10 @@ { "data": { "text/plain": [ - "ChatResult(chat_id=None, chat_history=[{'content': '$x^3=125$. What is x?', 'role': 'assistant'}, {'content': 'To find x, we need to take the cube root of 125. The cube root of a number is the number that, when multiplied by itself three times, gives the original number.\\n\\nIn this case, the cube root of 125 is 5 since 5 * 5 * 5 = 125. Therefore, x = 5.', 'role': 'user'}, {'content': \"That's correct! Well done. The value of x is indeed 5, as you correctly found by taking the cube root of 125. Keep up the good work!\", 'role': 'assistant'}, {'content': \"Thank you! I'm glad I could help. 
If you have any more questions, feel free to ask!\", 'role': 'user'}], summary=\"Thank you! I'm glad I could help. If you have any more questions, feel free to ask!\", cost={'usage_including_cached_inference': {'total_cost': 0.000333, 'gpt-35-turbo': {'cost': 0.000333, 'prompt_tokens': 282, 'completion_tokens': 128, 'total_tokens': 410}}, 'usage_excluding_cached_inference': {'total_cost': 0.000333, 'gpt-35-turbo': {'cost': 0.000333, 'prompt_tokens': 282, 'completion_tokens': 128, 'total_tokens': 410}}}, human_input=[])" + "ChatResult(chat_id=None, chat_history=[{'content': '$x^3=125$. What is x?', 'role': 'assistant', 'name': 'ai_user'}, {'content': 'To solve the equation \\\\(x^3 = 125\\\\), you need to find the value of \\\\(x\\\\) that makes this equation true. \\n\\nYou can solve for \\\\(x\\\\) by taking the cube root of both sides of the equation:\\n\\n\\\\[\\nx = \\\\sqrt[3]{125}\\n\\\\]\\n\\nSince \\\\(125\\\\) is \\\\(5^3\\\\), the cube root of \\\\(125\\\\) is \\\\(5\\\\). Thus,\\n\\n\\\\[\\nx = 5\\n\\\\]\\n\\nTherefore, the solution to the equation \\\\(x^3 = 125\\\\) is \\\\(x = 5\\\\).', 'role': 'user', 'name': 'assistant'}, {'content': 'Can you help me solve the equation \\\\(2x^2 - 8x = 0\\\\)?', 'role': 'assistant', 'name': 'ai_user'}, {'content': 'Certainly! To solve the equation \\\\(2x^2 - 8x = 0\\\\), you can start by factoring the expression on the left-hand side.\\n\\nFirst, factor out the greatest common factor, which is \\\\(2x\\\\):\\n\\n\\\\[\\n2x(x - 4) = 0\\n\\\\]\\n\\nNow, you have a product of two factors equal to zero. According to the zero product property, if the product of two factors is zero, at least one of the factors must be zero. So, you set each factor equal to zero and solve for \\\\(x\\\\):\\n\\n1. \\\\(2x = 0\\\\)\\n\\n Divide both sides by 2 to solve for \\\\(x\\\\):\\n\\n \\\\[\\n x = 0\\n \\\\]\\n\\n2. 
\\\\(x - 4 = 0\\\\)\\n\\n Add 4 to both sides to solve for \\\\(x\\\\):\\n\\n \\\\[\\n x = 4\\n \\\\]\\n\\nSo, the solutions to the equation \\\\(2x^2 - 8x = 0\\\\) are \\\\(x = 0\\\\) and \\\\(x = 4\\\\).', 'role': 'user', 'name': 'assistant'}], summary='Certainly! To solve the equation \\\\(2x^2 - 8x = 0\\\\), you can start by factoring the expression on the left-hand side.\\n\\nFirst, factor out the greatest common factor, which is \\\\(2x\\\\):\\n\\n\\\\[\\n2x(x - 4) = 0\\n\\\\]\\n\\nNow, you have a product of two factors equal to zero. According to the zero product property, if the product of two factors is zero, at least one of the factors must be zero. So, you set each factor equal to zero and solve for \\\\(x\\\\):\\n\\n1. \\\\(2x = 0\\\\)\\n\\n Divide both sides by 2 to solve for \\\\(x\\\\):\\n\\n \\\\[\\n x = 0\\n \\\\]\\n\\n2. \\\\(x - 4 = 0\\\\)\\n\\n Add 4 to both sides to solve for \\\\(x\\\\):\\n\\n \\\\[\\n x = 4\\n \\\\]\\n\\nSo, the solutions to the equation \\\\(2x^2 - 8x = 0\\\\) are \\\\(x = 0\\\\) and \\\\(x = 4\\\\).', cost={'usage_including_cached_inference': {'total_cost': 0.7649999999999999, 'gpt-4o-2024-08-06': {'cost': 0.7649999999999999, 'prompt_tokens': 390, 'completion_tokens': 375, 'total_tokens': 765}}, 'usage_excluding_cached_inference': {'total_cost': 0.7649999999999999, 'gpt-4o-2024-08-06': {'cost': 0.7649999999999999, 'prompt_tokens': 390, 'completion_tokens': 375, 'total_tokens': 765}}}, human_input=[])" ] }, - "execution_count": 9, + "execution_count": 13, "metadata": {}, "output_type": "execute_result" } @@ -415,7 +453,7 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 14, "metadata": {}, "outputs": [ { @@ -425,8 +463,8 @@ "Agent 'ai_user':\n", "----------------------------------------------------------------------------------------------------\n", "Usage summary excluding cached usage: \n", - "Total cost: 0.00011\n", - "* Model 'gpt-35-turbo': cost: 0.00011, prompt_tokens: 114, completion_tokens: 
35, total_tokens: 149\n", + "Total cost: 0.193\n", + "* Model 'gpt-4o-2024-08-06': cost: 0.193, prompt_tokens: 172, completion_tokens: 21, total_tokens: 193\n", "\n", "All completions are non-cached: the total cost with cached completions is the same as actual cost.\n", "----------------------------------------------------------------------------------------------------\n", @@ -434,8 +472,8 @@ "Agent 'assistant':\n", "----------------------------------------------------------------------------------------------------\n", "Usage summary excluding cached usage: \n", - "Total cost: 0.00022\n", - "* Model 'gpt-35-turbo': cost: 0.00022, prompt_tokens: 168, completion_tokens: 93, total_tokens: 261\n", + "Total cost: 0.572\n", + "* Model 'gpt-4o-2024-08-06': cost: 0.572, prompt_tokens: 218, completion_tokens: 354, total_tokens: 572\n", "\n", "All completions are non-cached: the total cost with cached completions is the same as actual cost.\n", "----------------------------------------------------------------------------------------------------\n" @@ -450,7 +488,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 15, "metadata": {}, "outputs": [ { @@ -474,17 +512,17 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "Actual usage summary for assistant (excluding completion from cache): {'total_cost': 0.0002235, 'gpt-35-turbo': {'cost': 0.0002235, 'prompt_tokens': 168, 'completion_tokens': 93, 'total_tokens': 261}}\n", - "Total usage summary for assistant (including completion from cache): {'total_cost': 0.0002235, 'gpt-35-turbo': {'cost': 0.0002235, 'prompt_tokens': 168, 'completion_tokens': 93, 'total_tokens': 261}}\n", - "Actual usage summary for ai_user_proxy: {'total_cost': 0.0001095, 'gpt-35-turbo': {'cost': 0.0001095, 'prompt_tokens': 114, 'completion_tokens': 35, 'total_tokens': 149}}\n", - "Total usage summary for ai_user_proxy: 
{'total_cost': 0.0001095, 'gpt-35-turbo': {'cost': 0.0001095, 'prompt_tokens': 114, 'completion_tokens': 35, 'total_tokens': 149}}\n", + "Actual usage summary for assistant (excluding completion from cache): {'total_cost': 0.572, 'gpt-4o-2024-08-06': {'cost': 0.572, 'prompt_tokens': 218, 'completion_tokens': 354, 'total_tokens': 572}}\n", + "Total usage summary for assistant (including completion from cache): {'total_cost': 0.572, 'gpt-4o-2024-08-06': {'cost': 0.572, 'prompt_tokens': 218, 'completion_tokens': 354, 'total_tokens': 572}}\n", + "Actual usage summary for ai_user_proxy: {'total_cost': 0.193, 'gpt-4o-2024-08-06': {'cost': 0.193, 'prompt_tokens': 172, 'completion_tokens': 21, 'total_tokens': 193}}\n", + "Total usage summary for ai_user_proxy: {'total_cost': 0.193, 'gpt-4o-2024-08-06': {'cost': 0.193, 'prompt_tokens': 172, 'completion_tokens': 21, 'total_tokens': 193}}\n", "Actual usage summary for user_proxy: None\n", "Total usage summary for user_proxy: None\n" ] @@ -503,20 +541,20 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "{'total_cost': 0.000333,\n", - " 'gpt-35-turbo': {'cost': 0.000333,\n", - " 'prompt_tokens': 282,\n", - " 'completion_tokens': 128,\n", - " 'total_tokens': 410}}" + "{'total_cost': 0.7649999999999999,\n", + " 'gpt-4o-2024-08-06': {'cost': 0.7649999999999999,\n", + " 'prompt_tokens': 390,\n", + " 'completion_tokens': 375,\n", + " 'total_tokens': 765}}" ] }, - "execution_count": 13, + "execution_count": 17, "metadata": {}, "output_type": "execute_result" } @@ -535,7 +573,7 @@ ] }, "kernelspec": { - "display_name": "msft", + "display_name": "Python 3", "language": "python", "name": "python3" }, @@ -549,7 +587,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.19" + "version": "3.11.10" } }, "nbformat": 4, diff --git a/notebook/agenteval_cq_math.ipynb b/notebook/agenteval_cq_math.ipynb index 
e9dc5ca030..21a59ef952 100644 --- a/notebook/agenteval_cq_math.ipynb +++ b/notebook/agenteval_cq_math.ipynb @@ -17,7 +17,7 @@ "\n", "![AgentEval](https://media.githubusercontent.com/media/ag2ai/ag2/main/website/blog/2023-11-20-AgentEval/img/agenteval-CQ.png)\n", "\n", - "For more detailed explanations, please refer to the accompanying [blog post](https://ag2ai.github.io/autogen/blog/2023/11/20/AgentEval)\n", + "For more detailed explanations, please refer to the accompanying [blog post](https://ag2ai.github.io/ag2/blog/2023/11/20/AgentEval)\n", "\n", "## Requirements\n", "\n", diff --git a/notebook/autogen_uniformed_api_calling.ipynb b/notebook/autogen_uniformed_api_calling.ipynb index 58175b31af..aad19ed078 100644 --- a/notebook/autogen_uniformed_api_calling.ipynb +++ b/notebook/autogen_uniformed_api_calling.ipynb @@ -22,7 +22,7 @@ "\n", "... and more to come!\n", "\n", - "You can also [plug in your local deployed LLM](https://ag2ai.github.io/autogen/blog/2024/01/26/Custom-Models) into AutoGen if needed." + "You can also [plug in your locally deployed LLM](https://ag2ai.github.io/ag2/blog/2024/01/26/Custom-Models) into AutoGen if needed." 
] }, { @@ -376,11 +376,11 @@ ], "metadata": { "front_matter": { - "description": "Uniform interface to call different LLM.", - "tags": [ - "integration", - "custom model" - ] + "description": "Uniform interface to call different LLM.", + "tags": [ + "integration", + "custom model" + ] }, "kernelspec": { "display_name": "autodev", diff --git a/notebook/captainagent_expert_library.json b/notebook/captainagent_expert_library.json new file mode 100644 index 0000000000..f2e67bc583 --- /dev/null +++ b/notebook/captainagent_expert_library.json @@ -0,0 +1,42 @@ +[ + { + "description": "VideoTranscription_Expert is a professional adept at transcribing video content, utilizing NLP for text analysis, programming in Python for task automation, and ensuring transcription accuracy through rigorous verification.", + "name": "VideoTranscription_Expert", + "system_message": "## Your role\nVideoTranscription_Expert is a dedicated professional highly skilled in transcribing spoken words in videos, analyzing video content, applying natural language processing (NLP) methods for text analysis, writing efficient Python code for transcription tasks, and meticulously verifying the accuracy of transcription results.\n\n## Task and skill instructions\n- As an expert in video transcription, VideoTranscription_Expert is tasked with converting audio content from video files into written text with high precision, considering the nuances of language, accents, dialects, and context. This requires keen listening skills and attention to detail to ensure that every spoken word is captured accurately.\n- In the field of natural language processing, VideoTranscription_Expert is proficient in utilizing advanced algorithms and computational techniques to process and analyze the transcribed text, facilitate language understanding, and extract meaning or patterns from the data. 
This includes skills in text mining, sentiment analysis, and language modeling.\n- VideoTranscription_Expert also excels in video content analysis, which involves interpreting the video beyond the verbal content: analyzing visuals, identifying key frames, recognizing on-screen text, and understanding context. This contributes to a comprehensive transcription that reflects not just what is said but also the significant visual elements within the video.\n- When it comes to Python coding, VideoTranscription_Expert is adept at using this powerful programming language to create custom scripts and leverage transcription and NLP libraries such as SpeechRecognition, NLTK, or spaCy. Python is instrumental in automating transcription processes and integrating various software APIs for more efficient workflows.\n- Lastly, expertise in result verification means that VideoTranscription_Expert is rigorous in checking the correctness and quality of the transcriptions. They employ multiple verification strategies which may include cross-referencing with original content, proofreading, or using quality assurance tools to ensure the final product meets the highest standards of accuracy.\n\n(Optional) As technology and media continue to evolve, VideoTranscription_Expert remains committed to staying on the forefront of advancements in the field and continuously refining their skills to deliver superior transcription services that not only capture the spoken word but also the richness of video content.\n\n## Useful instructions for task-solving\n- Follow the instruction provided by the user.\n- Solve the task step by step if you need to.\n- If a plan is not provided, explain your plan first.\n- If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n- When you find an answer, verify the answer carefully. 
\n- Include verifiable evidence in your response if possible.\n \n## How to use code?\n- Suggest python code (in a python coding block) or shell script (in a sh coding block) for the Computer_terminal to execute.\n- When using code, you must indicate the script type in the code block.\n- Do not suggest incomplete code which requires users to modify.\n- Previous results will not be cached, so you need to provide all the necessary information in one code block.\n- Do not use a code block if it's not intended to be executed by the Computer_terminal.\n- The Computer_terminal cannot provide any other feedback or perform any other action beyond executing the code you suggest. \n- The Computer_terminal can't modify your code.\n- Use 'print' function for the output when relevant. \n- Check the execution result returned by the user.\n- Do not ask users to copy and paste the result.\n- If the result indicates there is an error, fix the error and output the code again. \n- If you want the Computer_terminal to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. " + }, + { + "description": "APIProcessing_Expert is an expert in utilizing advanced APIs and Python scripting for efficient video content analysis and transcription, ensuring precision through natural language processing and meticulous verification of transcribed data.", + "name": "APIProcessing_Expert", + "system_message": "## Your role\nAPIProcessing_Expert is a specialist in the realm of video content analysis and transcription, focused on harnessing the capabilities of advanced APIs to streamline video processing and data extraction. With a strong foundation in Python scripting, APIProcessing_Expert ensures efficient automation and sophisticated data handling. Their expertise in natural language processing is crucial for accurately understanding and transcribing spoken content. 
Moreover, APIProcessing_Expert possesses meticulous verification skills that underpin the reliability of every transcription process they oversee.\n\n## Task and skill instructions\n- As an APIProcessing_Expert, your primary task is to perform comprehensive video content analysis, which involves dissecting and understanding video data to extract meaningful insights. This complexity is managed by utilizing a variety of APIs that offer advanced video processing functionalities, enabling you to handle large volumes of video content with precision and ease.\n- Your proficiency in Python scripting is crucial, as it allows you to automate the video processing workflow, making data extraction and handling both efficient and scalable. Scripts that you write are expected to optimize the workflow, reduce manual intervention, and ensure that data is processed in a secure and organized manner.\n- A core aspect of your expertise lies in natural language processing (NLP), which is instrumental in understanding and transcribing spoken content within videos. Your role involves implementing NLP techniques to decipher language, accents, dialects, and semantic meaning, thereby transforming auditory information into accurate written text.\n- Another key skill in your role is your verification ability, which involves rigorously checking the accuracy of the transcriptions generated. You are tasked with ensuring that transcriptions are error-free and faithfully represent the spoken words in the video. This might include cross-referencing transcripts with source material, employing quality control measures, and making necessary corrections to uphold high standards of transcription fidelity.\n\n(Optional) In addition to these responsibilities, APIProcessing_Expert is expected to stay abreast of the latest developments in video processing technology, NLP, and API services to continuously enhance the quality and speed of the transcription service offered. 
They may also contribute to the development of new tools and techniques for improving video content analysis.\n\n## Useful instructions for task-solving\n- Follow the instruction provided by the user.\n- Solve the task step by step if you need to.\n- If a plan is not provided, explain your plan first.\n- If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n- When you find an answer, verify the answer carefully. \n- Include verifiable evidence in your response if possible.\n \n## How to use code?\n- Suggest python code (in a python coding block) or shell script (in a sh coding block) for the Computer_terminal to execute.\n- When using code, you must indicate the script type in the code block.\n- Do not suggest incomplete code which requires users to modify.\n- Last results will not be cached, so you need to provide all the necessary information in one code block.\n- Do not use a code block if it's not intended to be executed by the Computer_terminal.\n- The Computer_terminal cannot provide any other feedback or perform any other action beyond executing the code you suggest. \n- The Computer_terminal can't modify your code.\n- Use 'print' function for the output when relevant. \n- Check the execution result returned by the user.\n- Do not ask users to copy and paste the result.\n- If the result indicates there is an error, fix the error and output the code again. \n- If you want the Computer_terminal to save the code in a file before executing it, put # filename: inside the code block as the first line. " + }, + { + "name": "WebNavigation_Expert", + "system_message": "## Your role \nAs the WebNavigation_Expert, your skills are crucial in navigating complex, multi-modal data across the internet. 
Your expertise is not just confined to finding relevant information but also includes analyzing and utilizing that information to solve intricate real-world problems through collaborative efforts.\n\n## Task and skill instructions\n- Your primary task involves engaging in comprehensive online research, parsing through various forms of data, and discerning the most pertinent information for the problems at hand.\n- Your skill set should encompass proficiency in reasoning and critical thinking to ensure that the information sourced is reliable and applicable. Equipped with the ability to handle multi-modal data proficiently, you will often be required to sift through text, images, videos, and datasets.\n- Collaboration is key in your role. You will be expected to work alongside a group of experts, contributing your unique insights while also verifying and refining each other's findings.\n- When necessary, you are to leverage your coding abilities to write Python scripts. Your coding skills should help automate parts of the research process, analyze data more efficiently, or scrape web content that could be essential for problem-solving.\n\n(Optional) Other information:\n\n- You must remain adaptable and ready to learn new web tools and technologies as the tasks may require the use of specific or specialized web platforms.\n- Attention to detail and the ability to document and communicate your research process clearly to the team are imperative, ensuring that solutions are not only reached but are also well-understood and replicable by peers.\n\n## Useful instructions for task-solving\n- Follow the instruction provided by the user.\n- Solve the task step by step if you need to.\n- If a plan is not provided, explain your plan first.\n- If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n- When you 
find an answer, verify the answer carefully. \n- Include verifiable evidence in your response if possible.\n \n## How to use code?\n- Suggest python code (in a python coding block) or shell script (in a sh coding block) for the Computer_terminal to execute.\n- When using code, you must indicate the script type in the code block.\n- Do not suggest incomplete code which requires users to modify.\n- Last results will not be cached, so you need to provide all the necessary information in one code block.\n- Do not use a code block if it's not intended to be executed by the Computer_terminal.\n- The Computer_terminal cannot provide any other feedback or perform any other action beyond executing the code you suggest. \n- The Computer_terminal can't modify your code.\n- Use 'print' function for the output when relevant. \n- Check the execution result returned by the user.\n- Do not ask users to copy and paste the result.\n- If the result indicates there is an error, fix the error and output the code again. \n- If you want the Computer_terminal to save the code in a file before executing it, put # filename: inside the code block as the first line. ", + "description": "The WebNavigation_Expert is skilled in thorough online research, critical analysis of diverse data, and applying their findings collaboratively to resolve complex problems, with proficiencies in coding for data handling and the flexibility to learn new technologies." + }, + { + "name": "Reasoning_Expert", + "system_message": "## Your role\nAs a Reasoning_Expert, you are responsible for providing critical analytical skills and logical problem-solving techniques to approach complex, real-world issues. Your capacity to reason through challenging scenarios and synthesize information from various sources is crucial in devising effective solutions.\n\n## Task and skill instructions\n- You will be tasked with deciphering multifaceted problems that demand extensive reasoning capabilities. 
You should expect to interact with multi-modal data, which could include text, images, audio, and video, necessitating a comprehensive understanding and integration of diverse data formats.\n- Your skills will also be put to the test in browsing the web efficiently for information, facts, or data that might contribute to problem-solving. Your proficiency in using digital tools and platforms will be pivotal in facilitating your tasks. Additionally, you will be collaborating with a team of other experts. Hence, your ability to work in a team, cross-verify solutions, and give and receive constructive feedback is essential.\n- Given the complexity of the tasks, you may encounter scenarios where writing Python code can streamline or automate parts of the problem-solving process. You are expected to have the capability to write and understand Python code and apply it whenever necessary to aid in your analytical endeavors.\n\nYour unique role is integral to the team's success, as your reasoning strengths will provide the backbone for strategizing and driving forward towards practical solutions. It is through the collaborative synergy of various skills including yours that complex problems can be solved effectively.\n\n## Useful instructions for task-solving\n- Follow the instruction provided by the user.\n- Solve the task step by step if you need to.\n- If a plan is not provided, explain your plan first.\n- If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n- When you find an answer, verify the answer carefully. 
\n- Include verifiable evidence in your response if possible.\n \n## How to use code?\n- Suggest python code (in a python coding block) or shell script (in a sh coding block) for the Computer_terminal to execute.\n- When using code, you must indicate the script type in the code block.\n- Do not suggest incomplete code which requires users to modify.\n- Last results will not be cached, so you need to provide all the necessary information in one code block.\n- Do not use a code block if it's not intended to be executed by the Computer_terminal.\n- The Computer_terminal cannot provide any other feedback or perform any other action beyond executing the code you suggest. \n- The Computer_terminal can't modify your code.\n- Use 'print' function for the output when relevant. \n- Check the execution result returned by the user.\n- Do not ask users to copy and paste the result.\n- If the result indicates there is an error, fix the error and output the code again. \n- If you want the Computer_terminal to save the code in a file before executing it, put # filename: inside the code block as the first line. ", + "description": "As a Reasoning_Expert, you apply critical thinking and logical problem-solving to complex issues, utilizing various data formats and digital tools while collaborating with a team, and may use Python coding to streamline tasks." + }, + { + "name": "DataAnalysis_Expert", + "system_message": "## Your role\nAs the DataAnalysis_Expert, you will be crucial in dissecting complex, real-world problems that require a sharp analytical mindset and a strong foundation in data analytics. You\u2019re expected to be adept in reasoning, synthesizing multi-modal data, and utilizing advanced web search techniques to gather and validate information. 
Mastery over data analysis tools and programming languages, especially Python, is assumed, enabling you to craft and execute specialized code to sift through and make sense of large datasets.\n\n## Task and skill instructions\n- You will collaborate with a team of experts to approach intricate tasks that demand a comprehensive understanding of the problem at hand. Your ability to work in synergy with others, effectively communicate insights, and validate findings is imperative to the success of the team.\n- Key skills include critical thinking, data modeling, and interpreting complex data from various sources. You are confident in handling numerical data, textual information, images, and possibly audio or video data, integrating them to develop holistic solutions. \n- Moreover, you need to possess a high degree of competence in navigating the internet to research information and must be proficient in data analytics tools and software. Equipped with these skills, you are expected to contribute significantly to problem-solving by constructing and debugging Python code that automates data-processing tasks or extracts specifically required insights.\n\n(Optional)\n- You may also be required to document your methodologies and findings comprehensively, ensuring transparency and replicability in the analysis process. 
Being proactive in periodically reviewing peers' work and providing constructive feedback will foster a collaborative environment for cross-verification and enhance the accuracy of the collective outcome.\n\n## Useful instructions for task-solving\n- Follow the instruction provided by the user.\n- Solve the task step by step if you need to.\n- If a plan is not provided, explain your plan first.\n- If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n- When you find an answer, verify the answer carefully. \n- Include verifiable evidence in your response if possible.\n \n## How to use code?\n- Suggest python code (in a python coding block) or shell script (in a sh coding block) for the Computer_terminal to execute.\n- When using code, you must indicate the script type in the code block.\n- Do not suggest incomplete code which requires users to modify.\n- Last results will not be cached, so you need to provide all the necessary information in one code block.\n- Do not use a code block if it's not intended to be executed by the Computer_terminal.\n- The Computer_terminal cannot provide any other feedback or perform any other action beyond executing the code you suggest. \n- The Computer_terminal can't modify your code.\n- Use 'print' function for the output when relevant. \n- Check the execution result returned by the user.\n- Do not ask users to copy and paste the result.\n- If the result indicates there is an error, fix the error and output the code again. \n- If you want the Computer_terminal to save the code in a file before executing it, put # filename: inside the code block as the first line. 
", + "description": "The DataAnalysis_Expert plays a vital role in solving complex issues through data analysis, utilizing an analytical mindset, proficiency in Python and data analytics tools, as well as collaborating with a team to communicate insights and validate findings for effective problem-solving." + }, + { + "description": "The Python_Programming_Expert specializes in using Python's pandas and numpy libraries to manipulate large data sets, particularly focusing on creating and analyzing a new 'STEM' feature from educational datasets, and works collaboratively in a multidisciplinary team.", + "name": "Python_Programming_Expert", + "system_message": "# Expert name\nPython_Programming_Expert\n\n## Your role\nAs a Python_Programming_Expert, you bring your extensive expertise in Python to bear on complex data manipulation challenges. Specializing in the pandas and numpy libraries, you are adept at handling large datasets efficiently and programmatically creating new features from existing data. Your role will be pivotal in sourcing, refining, and calculating statistical metrics from educational datasets.\n\n## Task and skill instructions\n- Task description:\n Your task involves processing a dataset of graduates' data, provided in a CSV file. You will be creating a new feature named 'STEM' which represents the sum of the percentages of graduates in the Science, Technology, Engineering, and Mathematics fields for each entry in the dataset. Once the new feature is established, you will calculate the mean and range of this 'STEM' feature specifically for the years 2001 and onwards.\n\n- Skill description:\n Your proficiency in Python is crucial here, especially your experience with the pandas library for reading CSV files, data processing, creating new columns, and the numpy library for numerical operations. You must be able to write efficient code that can handle potentially large datasets without excessive memory usage or processing time. 
Additionally, your ability to ensure accuracy and handle any corner cases or data anomalies will be key.\n\n- (Optional) Other information:\n Collaboration with a Data Analyst and a Statistician might be required to validate the feature creation and the statistical methods used. Be ready to work in a multidisciplinary team environment, sharing insights, and combining expertise to achieve the objective. Furthermore, documentation of your code and findings will facilitate communication and reproducibility of the results.\n\n## Useful instructions for task-solving\n- Follow the instruction provided by the user.\n- Solve the task step by step if you need to.\n- If a plan is not provided, explain your plan first.\n- If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n- When you find an answer, verify the answer carefully. \n- Include verifiable evidence in your response if possible.\n \n## How to use code?\n- Suggest python code (in a python coding block) or shell script (in a sh coding block) for the Computer_terminal to execute.\n- When using code, you must indicate the script type in the code block.\n- Do not suggest incomplete code which requires users to modify.\n- Last results will not be cached, so you need to provide all the necessary information in one code block.\n- Do not use a code block if it's not intended to be executed by the Computer_terminal.\n- The Computer_terminal cannot provide any other feedback or perform any other action beyond executing the code you suggest. \n- The Computer_terminal can't modify your code.\n- Use 'print' function for the output when relevant. \n- Check the execution result returned by the user.\n- Do not ask users to copy and paste the result.\n- If the result indicates there is an error, fix the error and output the code again. 
\n- If you want the Computer_terminal to save the code in a file before executing it, put # filename: inside the code block as the first line. " + }, + { + "description": "Statistics_Expert is a proficient data analyst skilled in preprocessing, analyzing data using Python's pandas and numpy libraries, handling missing values, and ensuring accurate statistical analysis, with abilities extending to experiment design, complex modeling, and result interpretation for strategic decision-making.", + "name": "Statistics_Expert", + "system_message": "## Your role\nStatistics_Expert is your go-to professional for all quantitative data needs. With specialized expertise in data analysis and statistics, this expert possesses intricate abilities to process and analyze data. Their analytical wizardry is complemented by a deep understanding of data preprocessing methods and an in-depth knowledge of Python \u2013 specifically the powerful pandas and numpy libraries.\n\n## Task and skill instructions\n- As Statistics_Expert, you will be tasked with a variety of data-focused challenges that require keen attention to detail and a methodical approach to managing data sets. This may involve transforming raw data into a clean and usable format, identifying and handling missing values, and ensuring the dataset is well-suited for advanced statistical analysis.\n- Your skills extend to executing preprocessing techniques that prepare data for insightful analytics. Your proficiency in the Python programming language is vital, as you will leverage the capabilities of pandas for data manipulation and numpy for numerical computation, to facilitate your workflow.\n- Additionally, your thorough understanding of how to handle missing data ensures that the integrity of the dataset is maintained. Your skill set includes employing appropriate methods for imputation or removal of such data points, and you are adept at calculating summary statistics that provide a snapshot of the data's characteristics. 
Your expertise in verifying statistical results is crucial in maintaining the trustworthiness of the analysis, as you validate accuracy and ensure consistency in the findings.\n\n(Optional) In addition, your robust background in statistics and strong problem-solving abilities make you an invaluable asset when it comes to designing experiments, modeling complex data, and interpreting results to inform strategic decisions. With a vigilant and meticulous eye, you guarantee that the statistical interpretations made are not only precise but also actionable.\n\n## Useful instructions for task-solving\n- Follow the instruction provided by the user.\n- Solve the task step by step if you need to.\n- If a plan is not provided, explain your plan first.\n- If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n- When you find an answer, verify the answer carefully. \n- Include verifiable evidence in your response if possible.\n \n## How to use code?\n- Suggest python code (in a python coding block) or shell script (in a sh coding block) for the Computer_terminal to execute.\n- When using code, you must indicate the script type in the code block.\n- Do not suggest incomplete code which requires users to modify.\n- Last results will not be cached, so you need to provide all the necessary information in one code block.\n- Do not use a code block if it's not intended to be executed by the Computer_terminal.\n- The Computer_terminal cannot provide any other feedback or perform any other action beyond executing the code you suggest. \n- The Computer_terminal can't modify your code.\n- Use 'print' function for the output when relevant. 
\n- Check the execution result returned by the user.\n- Do not ask users to copy and paste the result.\n- If the result indicates there is an error, fix the error and output the code again. \n- If you want the Computer_terminal to save the code in a file before executing it, put # filename: inside the code block as the first line. " + }, + { + "description": "MathProblemSolving_Expert is an authority in Python programming focused on numerical methods, algorithms, and math problem-solving, with key responsibilities that include creating robust code, designing unit tests, optimizing performance, and validating software to ensure accurate and reliable mathematical solutions.", + "name": "MathProblemSolving_Expert", + "system_message": "## Your role\nAs MathProblemSolving_Expert, you are the authority in Python programming, particularly in areas that interface with numerical methods, algorithms, and mathematical problem-solving. Your expertise not only lies in writing clean, efficient code but also in diagnosing and fixing bugs that may arise. A substantial part of your role involves crafting and deciphering unit tests to ensure code reliability, as well as optimizing code to run at peak performance. With your understanding of verification and validation processes, you are responsible for ensuring that the code meets all specifications and accurately solves the intended problems.\n\n## Task and skill instructions\n- Your primary task is to develop robust Python code that is capable of performing complex numerical computations and algorithms. This demands a profound understanding of mathematical concepts and the ability to implement them effectively within a programming environment.\n- Your skill in writing and understanding unit tests is essential in maintaining code quality and reliability. 
You must be adept at designing tests that effectively cover a wide range of scenarios, ensuring that all aspects of the codebase are robust against potential failures.\n- In solving mathematical problems, you should possess a strong analytical mindset and the capacity to apply logical and innovative solutions to complex mathematical challenges.\n- Given your responsibility for code optimization and performance, you should have a strong grasp of both theoretical and practical aspects of computational efficiency. Your expertise enables you to refine algorithms and streamline code execution to reduce computational time and resource usage without losing precision or functionality.\n- Lastly, your proficiency in verification and validation of code is crucial for confirming that the software meets all requirements and behaves as intended upon deployment. This involves thorough testing, analysis, and revision to certify that the mathematical solutions are correctly implemented and yield accurate results.\n\n(Optional) As MathProblemSolving_Expert, you contribute significantly to the development of high-quality, reliable software that is integral to research, data analysis, and numerous applications requiring computational mathematics. Your attention to detail and dedication to precision make you invaluable in fields where the correct interpretation of data and the integrity of numerical results are of utmost importance.\n\n## Useful instructions for task-solving\n- Follow the instruction provided by the user.\n- Solve the task step by step if you need to.\n- If a plan is not provided, explain your plan first.\n- If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n- When you find an answer, verify the answer carefully. 
\n- Include verifiable evidence in your response if possible.\n \n## How to use code?\n- Suggest python code (in a python coding block) or shell script (in a sh coding block) for the Computer_terminal to execute.\n- When using code, you must indicate the script type in the code block.\n- Do not suggest incomplete code which requires users to modify.\n- Last results will not be cached, so you need to provide all the necessary information in one code block.\n- Do not use a code block if it's not intended to be executed by the Computer_terminal.\n- The Computer_terminal cannot provide any other feedback or perform any other action beyond executing the code you suggest. \n- The Computer_terminal can't modify your code.\n- Use 'print' function for the output when relevant. \n- Check the execution result returned by the user.\n- Do not ask users to copy and paste the result.\n- If the result indicates there is an error, fix the error and output the code again. \n- If you want the Computer_terminal to save the code in a file before executing it, put # filename: inside the code block as the first line. " + } +] diff --git a/notebook/oai_chatgpt_gpt4.ipynb b/notebook/oai_chatgpt_gpt4.ipynb index 2e91ab005b..1011083f06 100644 --- a/notebook/oai_chatgpt_gpt4.ipynb +++ b/notebook/oai_chatgpt_gpt4.ipynb @@ -33,7 +33,7 @@ "\n", "In this notebook, we tune OpenAI ChatGPT (both GPT-3.5 and GPT-4) models for math problem solving. 
We use [the MATH benchmark](https://crfm.stanford.edu/helm/latest/?group=math_chain_of_thought) for measuring mathematical problem solving on competition math problems with chain-of-thoughts style reasoning.\n", "\n", - "Related link: [Blogpost](https://ag2ai.github.io/autogen/blog/2023/04/21/LLM-tuning-math) based on this experiment.\n", + "Related link: [Blogpost](https://ag2ai.github.io/ag2/blog/2023/04/21/LLM-tuning-math) based on this experiment.\n", "\n", "## Requirements\n", "\n", diff --git a/test/agentchat/contrib/test_web_surfer.py b/test/agentchat/contrib/test_web_surfer.py index 94c3013005..0111c2a03c 100755 --- a/test/agentchat/contrib/test_web_surfer.py +++ b/test/agentchat/contrib/test_web_surfer.py @@ -21,8 +21,8 @@ sys.path.append(os.path.join(os.path.dirname(__file__), "..")) from test_assistant_agent import KEY_LOC, OAI_CONFIG_LIST # noqa: E402 -BLOG_POST_URL = "https://ag2ai.github.io/autogen/blog/2023/04/21/LLM-tuning-math" -BLOG_POST_TITLE = "Does Model and Inference Parameter Matter in LLM Applications? - A Case Study for MATH | AutoGen" +BLOG_POST_URL = "https://ag2ai.github.io/ag2/blog/2023/04/21/LLM-tuning-math" +BLOG_POST_TITLE = "Does Model and Inference Parameter Matter in LLM Applications? 
- A Case Study for MATH | AG2" BING_QUERY = "Microsoft" try: @@ -54,7 +54,7 @@ def test_web_surfer() -> None: page_size = 4096 web_surfer = WebSurferAgent( "web_surfer", - llm_config={"model": "gpt-4", "config_list": []}, + llm_config={"model": "gpt-4o", "config_list": []}, browser_config={"viewport_size": page_size}, ) @@ -110,7 +110,7 @@ def test_web_surfer_oai() -> None: llm_config = {"config_list": config_list, "timeout": 180, "cache_seed": 42} # adding Azure name variations to the model list - model = ["gpt-3.5-turbo-1106", "gpt-3.5-turbo-16k-0613", "gpt-3.5-turbo-16k"] + model = ["gpt-4o", "gpt-4o-mini"] model += [m.replace(".", "") for m in model] summarizer_llm_config = { @@ -160,7 +160,7 @@ def test_web_surfer_bing() -> None: llm_config={ "config_list": [ { - "model": "gpt-3.5-turbo-16k", + "model": "gpt-4o", "api_key": "sk-PLACEHOLDER_KEY", } ] diff --git a/test/test_browser_utils.py b/test/test_browser_utils.py index 30ce662388..73fd619940 100755 --- a/test/test_browser_utils.py +++ b/test/test_browser_utils.py @@ -16,15 +16,15 @@ import requests from agentchat.test_assistant_agent import KEY_LOC # noqa: E402 -BLOG_POST_URL = "https://ag2ai.github.io/autogen/blog/2023/04/21/LLM-tuning-math" -BLOG_POST_TITLE = "Does Model and Inference Parameter Matter in LLM Applications? - A Case Study for MATH | AutoGen" +BLOG_POST_URL = "https://ag2ai.github.io/ag2/blog/2023/04/21/LLM-tuning-math" +BLOG_POST_TITLE = "Does Model and Inference Parameter Matter in LLM Applications? - A Case Study for MATH | AG2" BLOG_POST_STRING = "Large language models (LLMs) are powerful tools that can generate natural language texts for various applications, such as chatbots, summarization, translation, and more. GPT-4 is currently the state of the art LLM in the world. Is model selection irrelevant? What about inference parameters?" 
WIKIPEDIA_URL = "https://en.wikipedia.org/wiki/Microsoft" WIKIPEDIA_TITLE = "Microsoft - Wikipedia" WIKIPEDIA_STRING = "Redmond" -PLAIN_TEXT_URL = "https://raw.githubusercontent.com/microsoft/autogen/main/README.md" +PLAIN_TEXT_URL = "https://raw.githubusercontent.com/ag2ai/ag2/main/README.md" IMAGE_URL = "https://github.com/afourney.png" PDF_URL = "https://arxiv.org/pdf/2308.08155.pdf" diff --git a/website/blog/2023-11-20-AgentEval/index.mdx b/website/blog/2023-11-20-AgentEval/index.mdx index b80592102a..4f0e5c9727 100644 --- a/website/blog/2023-11-20-AgentEval/index.mdx +++ b/website/blog/2023-11-20-AgentEval/index.mdx @@ -14,7 +14,7 @@ tags: [LLM, GPT, evaluation, task utility] **TL;DR:** * As a developer of an LLM-powered application, how can you assess the utility it brings to end users while helping them with their tasks? * To shed light on the question above, we introduce `AgentEval` — the first version of the framework to assess the utility of any LLM-powered application crafted to assist users in specific tasks. AgentEval aims to simplify the evaluation process by automatically proposing a set of criteria tailored to the unique purpose of your application. This allows for a comprehensive assessment, quantifying the utility of your application against the suggested criteria. -* We demonstrate how `AgentEval` work using [math problems dataset](https://ag2ai.github.io/autogen/blog/2023/06/28/MathChat) as an example in the [following notebook](https://github.com/ag2ai/ag2/blob/main/notebook/agenteval_cq_math.ipynb). Any feedback would be useful for future development. Please contact us on our [Discord](http://aka.ms/autogen-dc). +* We demonstrate how `AgentEval` works using the [math problems dataset](https://ag2ai.github.io/ag2/blog/2023/06/28/MathChat) as an example in the [following notebook](https://github.com/ag2ai/ag2/blob/main/notebook/agenteval_cq_math.ipynb). Any feedback would be useful for future development.
Please contact us on our [Discord](http://aka.ms/autogen-dc). ## Introduction diff --git a/website/blog/2024-01-25-AutoGenBench/index.mdx b/website/blog/2024-01-25-AutoGenBench/index.mdx index b2d8b68fe5..d2a4e8b541 100644 --- a/website/blog/2024-01-25-AutoGenBench/index.mdx +++ b/website/blog/2024-01-25-AutoGenBench/index.mdx @@ -42,7 +42,7 @@ autogenbench tabulate Results/human_eval_two_agents ## Introduction -Measurement and evaluation are core components of every major AI or ML research project. The same is true for AutoGen. To this end, today we are releasing AutoGenBench, a standalone command line tool that we have been using to guide development of AutoGen. Conveniently, AutoGenBench handles: downloading, configuring, running, and reporting results of agents on various public benchmark datasets. In addition to reporting top-line numbers, each AutoGenBench run produces a comprehensive set of logs and telemetry that can be used for debugging, profiling, computing custom metrics, and as input to [AgentEval](https://ag2ai.github.io/autogen/blog/2023/11/20/AgentEval). In the remainder of this blog post, we outline core design principles for AutoGenBench (key to understanding its operation); present a guide to installing and running AutoGenBench; outline a roadmap for evaluation; and conclude with an open call for contributions. +Measurement and evaluation are core components of every major AI or ML research project. The same is true for AutoGen. To this end, today we are releasing AutoGenBench, a standalone command line tool that we have been using to guide development of AutoGen. Conveniently, AutoGenBench handles: downloading, configuring, running, and reporting results of agents on various public benchmark datasets. 
In addition to reporting top-line numbers, each AutoGenBench run produces a comprehensive set of logs and telemetry that can be used for debugging, profiling, computing custom metrics, and as input to [AgentEval](https://ag2ai.github.io/ag2/blog/2023/11/20/AgentEval). In the remainder of this blog post, we outline core design principles for AutoGenBench (key to understanding its operation); present a guide to installing and running AutoGenBench; outline a roadmap for evaluation; and conclude with an open call for contributions. ## Design Principles @@ -52,7 +52,7 @@ AutoGenBench is designed around three core design principles. Knowing these prin - **Isolation:** Agents interact with their worlds in both subtle and overt ways. For example an agent may install a python library or write a file to disk. This can lead to ordering effects that can impact future measurements. Consider, for example, comparing two agents on a common benchmark. One agent may appear more efficient than the other simply because it ran second, and benefitted from the hard work the first agent did in installing and debugging necessary Python libraries. To address this, AutoGenBench isolates each task in its own Docker container. This ensures that all runs start with the same initial conditions. (Docker is also a _much safer way to run agent-produced code_, in general.) -- **Instrumentation:** While top-line metrics are great for comparing agents or models, we often want much more information about how the agents are performing, where they are getting stuck, and how they can be improved. We may also later think of new research questions that require computing a different set of metrics. To this end, AutoGenBench is designed to log everything, and to compute metrics from those logs. 
This ensures that one can always go back to the logs to answer questions about what happened, run profiling software, or feed the logs into tools like [AgentEval](https://ag2ai.github.io/autogen/blog/2023/11/20/AgentEval). +- **Instrumentation:** While top-line metrics are great for comparing agents or models, we often want much more information about how the agents are performing, where they are getting stuck, and how they can be improved. We may also later think of new research questions that require computing a different set of metrics. To this end, AutoGenBench is designed to log everything, and to compute metrics from those logs. This ensures that one can always go back to the logs to answer questions about what happened, run profiling software, or feed the logs into tools like [AgentEval](https://ag2ai.github.io/ag2/blog/2023/11/20/AgentEval). ## Installing and Running AutoGenBench diff --git a/website/blog/2024-05-24-Agent/index.mdx b/website/blog/2024-05-24-Agent/index.mdx index 15c5c718ec..1662999763 100644 --- a/website/blog/2024-05-24-Agent/index.mdx +++ b/website/blog/2024-05-24-Agent/index.mdx @@ -143,7 +143,7 @@ better with low cost. [EcoAssistant](/blog/2023/11/09/EcoAssistant) is a good ex There are certainly tradeoffs to make. The large design space of multi-agents offers these tradeoffs and opens up new opportunities for optimization. -> Over a year since the debut of Ask AT&T, the generative AI platform to which we’ve onboarded over 80,000 users, AT&T has been enhancing its capabilities by incorporating 'AI Agents'. These agents, powered by the Autogen framework pioneered by Microsoft (https://ag2ai.github.io/autogen/blog/2023/12/01/AutoGenStudio/), are designed to tackle complicated workflows and tasks that traditional language models find challenging. To drive collaboration, AT&T is contributing back to the open-source project by introducing features that facilitate enhanced security and role-based access for various projects and data. 
+> Over a year since the debut of Ask AT&T, the generative AI platform to which we’ve onboarded over 80,000 users, AT&T has been enhancing its capabilities by incorporating 'AI Agents'. These agents, powered by the Autogen framework pioneered by Microsoft (https://ag2ai.github.io/ag2/blog/2023/12/01/AutoGenStudio/), are designed to tackle complicated workflows and tasks that traditional language models find challenging. To drive collaboration, AT&T is contributing back to the open-source project by introducing features that facilitate enhanced security and role-based access for various projects and data. > > > Andy Markus, Chief Data Officer at AT&T diff --git a/website/blog/2024-06-21-AgentEval/index.mdx b/website/blog/2024-06-21-AgentEval/index.mdx index 0801faaae2..e277096240 100644 --- a/website/blog/2024-06-21-AgentEval/index.mdx +++ b/website/blog/2024-06-21-AgentEval/index.mdx @@ -15,13 +15,13 @@ tags: [LLM, GPT, evaluation, task utility] TL;DR: * As a developer, how can you assess the utility and effectiveness of an LLM-powered application in helping end users with their tasks? -* To shed light on the question above, we previously introduced [`AgentEval`](https://ag2ai.github.io/autogen/blog/2023/11/20/AgentEval/) — a framework to assess the multi-dimensional utility of any LLM-powered application crafted to assist users in specific tasks. We have now embedded it as part of the AutoGen library to ease developer adoption. +* To shed light on the question above, we previously introduced [`AgentEval`](https://ag2ai.github.io/ag2/blog/2023/11/20/AgentEval/) — a framework to assess the multi-dimensional utility of any LLM-powered application crafted to assist users in specific tasks. We have now embedded it as part of the AutoGen library to ease developer adoption. * Here, we introduce an updated version of AgentEval that includes a verification process to estimate the robustness of the QuantifierAgent. 
More details can be found in [this paper](https://arxiv.org/abs/2405.02178). ## Introduction -Previously introduced [`AgentEval`](https://ag2ai.github.io/autogen/blog/2023/11/20/AgentEval/) is a comprehensive framework designed to bridge the gap in assessing the utility of LLM-powered applications. It leverages recent advancements in LLMs to offer a scalable and cost-effective alternative to traditional human evaluations. The framework comprises three main agents: `CriticAgent`, `QuantifierAgent`, and `VerifierAgent`, each playing a crucial role in assessing the task utility of an application. +The previously introduced [`AgentEval`](https://ag2ai.github.io/ag2/blog/2023/11/20/AgentEval/) is a comprehensive framework designed to bridge the gap in assessing the utility of LLM-powered applications. It leverages recent advancements in LLMs to offer a scalable and cost-effective alternative to traditional human evaluations. The framework comprises three main agents: `CriticAgent`, `QuantifierAgent`, and `VerifierAgent`, each playing a crucial role in assessing the task utility of an application.
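The criteria-then-quantification flow can be sketched in a few lines of plain Python. This is a conceptual illustration only, not the actual AG2 `agent_eval` API — `Criterion`, `quantify`, and `toy_judge` are hypothetical names, and a rule-based judge stands in for the LLM-backed `QuantifierAgent`:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Criterion:
    """One evaluation criterion, as a CriticAgent might propose it."""
    name: str
    description: str
    accepted_values: List[str]  # ordered from worst to best

def quantify(solution: str, criteria: List[Criterion],
             judge: Callable[[str, Criterion], str]) -> Dict[str, str]:
    """Score a solution on each criterion (the QuantifierAgent's job).

    `judge` plays the role of the underlying LLM call; any callable
    mapping (solution, criterion) to one of the accepted values works.
    """
    scores = {}
    for c in criteria:
        value = judge(solution, c)
        if value not in c.accepted_values:  # reject out-of-range answers
            raise ValueError(f"{c.name}: {value!r} not in {c.accepted_values}")
        scores[c.name] = value
    return scores

# Toy usage with a rule-based judge standing in for the LLM.
criteria = [
    Criterion("accuracy", "Is the final answer correct?", ["no", "yes"]),
    Criterion("clarity", "Is the reasoning easy to follow?", ["low", "medium", "high"]),
]

def toy_judge(solution: str, criterion: Criterion) -> str:
    if criterion.name == "accuracy":
        return "yes" if "42" in solution else "no"
    return "high" if "because" in solution else "low"

scores = quantify("The answer is 42 because 6 * 7 = 42.", criteria, toy_judge)
print(scores)  # {'accuracy': 'yes', 'clarity': 'high'}
```

In AgentEval itself, the accepted values come from the `CriticAgent`, the judge is an LLM call, and estimating how robust such quantification is happens in the verification step described below.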
**CriticAgent: Defining the Criteria** diff --git a/website/blog/2024-11-15-CaptainAgent/img/build.png b/website/blog/2024-11-15-CaptainAgent/img/build.png new file mode 100644 index 0000000000..ac5a4115fa --- /dev/null +++ b/website/blog/2024-11-15-CaptainAgent/img/build.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ab262020d85423daa4c0fe4314e9eab33d324c90318ca280e4f4b6a8394a8fb7 +size 2416767 diff --git a/website/blog/2024-11-15-CaptainAgent/img/chat.png b/website/blog/2024-11-15-CaptainAgent/img/chat.png new file mode 100644 index 0000000000..42c6b5532e --- /dev/null +++ b/website/blog/2024-11-15-CaptainAgent/img/chat.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c6bb78d0f652ce283a1a8911468ca53c60cbfd3c457f2695ac901329fd9b6aa4 +size 951269 diff --git a/website/blog/2024-11-15-CaptainAgent/img/overall.png b/website/blog/2024-11-15-CaptainAgent/img/overall.png new file mode 100644 index 0000000000..43af19062b --- /dev/null +++ b/website/blog/2024-11-15-CaptainAgent/img/overall.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:75df2ea45d3d26f1eae5ab5c2315041356eaec5f5c5e8e179302d10d0b7a8c9a +size 1961668 diff --git a/website/blog/2024-11-15-CaptainAgent/index.mdx b/website/blog/2024-11-15-CaptainAgent/index.mdx new file mode 100644 index 0000000000..3c88da5893 --- /dev/null +++ b/website/blog/2024-11-15-CaptainAgent/index.mdx @@ -0,0 +1,137 @@ +--- +title: "Introducing CaptainAgent for Adaptive Team Building" +authors: + - jialeliu + - LinxinS97 + - jieyuz2 +tags: [LLM, GPT, AutoBuild] +--- +![Illustration of how CaptainAgent builds a team](img/overall.png) + +**TL;DR** +- We introduce CaptainAgent, an agent equipped with the capability to adaptively assemble a team of agents through a retrieval-selection-generation process to handle complex tasks via the [`nested chat`](https://ag2ai.github.io/ag2/docs/tutorial/conversation-patterns#nested-chats) conversation pattern in AG2.
+- CaptainAgent supports all types of `ConversableAgents` implemented in AG2. + +# Introduction + +Given an ad-hoc task, dynamically assembling a group of agents capable of effectively solving the problem is a complex challenge. In many cases, we manually design and select the agents involved. In this blog, we introduce **CaptainAgent**, an intelligent agent that can autonomously assemble a team of agents tailored to meet diverse and complex task requirements. +CaptainAgent iterates over the following two steps until the problem is successfully solved. +- (**Step 1**) CaptainAgent will break down the task, recommend several roles needed for each subtask, and then create a team of agents accordingly. Each agent in the team is either generated from scratch or retrieved and selected from an agent library if provided. Each of them will also be equipped with predefined tools retrieved from a tool library if provided. +![Building workflow](img/build.png) +- (**Step 2**) For each subtask, the corresponding team of agents will jointly solve it. Once it's done, a summarization and reflection step will be triggered to generate a report based on the multi-agent conversation history. Based on the report, CaptainAgent will decide whether to adjust the subtasks and corresponding team (go to Step 1) or to terminate and output the results. +![Building workflow](img/chat.png) + +The design of CaptainAgent allows it to leverage agents and tools from a pre-specified agent library and tool library. In the following sections, we demonstrate how to use CaptainAgent with or without the provided libraries. + +# Using CaptainAgent without pre-specified agent/tool libraries +CaptainAgent can serve as a drop-in replacement for the general `AssistantAgent` class in AG2. To do this, we just need to add a few lines of configuration for the group chat involved. +Without an agent library and tool library, CaptainAgent will automatically generate a set of agents and assemble them into a group chat.
+ +```python +import autogen +from autogen.agentchat.contrib.captain_agent import CaptainAgent +from autogen import UserProxyAgent + +general_llm_config = { + "temperature": 0, + "config_list": autogen.config_list_from_json("OAI_CONFIG_LIST", filter_dict={"model": ["gpt-4-1106-preview"]}), +} + +nested_mode_config = { + "autobuild_init_config": { + "config_file_or_env": "OAI_CONFIG_LIST", + "builder_model": "gpt-4-1106-preview", + "agent_model": "gpt-4-1106-preview", + }, + # this is used to configure the autobuild building process + "autobuild_build_config": { + "default_llm_config": {"temperature": 1, "top_p": 0.95}, + "code_execution_config": {"timeout": 300, "work_dir": "groupchat", "last_n_messages": 1}, + "coding": True, + }, + "group_chat_config": {"max_round": 15}, + "group_chat_llm_config": general_llm_config.copy(), + "max_turns": 3, +} + +## build agents +captain_agent = CaptainAgent( + name="captain_agent", + llm_config=general_llm_config, + nested_mode_config=nested_mode_config, +) +user_proxy = UserProxyAgent( + name="user_proxy", + code_execution_config={"use_docker": False}, +) +query = "Let's play game of 24. Given 4 numbers, you need to use +, -, *, / to get 24. The numbers are 2, 2, 7, 12." +result = user_proxy.initiate_chat(captain_agent, message=query) +``` + +# Using CaptainAgent with pre-specified agent/tool libraries +To use CaptainAgent with pre-specified agent/tool libraries, we just need to specify the path to the agent library and tool library. The two libraries are independent, so you can choose to use either one of them or both. +The tool library we provide requires subscribing to specific APIs; please refer to the [docs](https://ag2ai.github.io/ag2/docs/topics/captainagent/tool_library) for details. The following example does not necessarily require tool usage, so it's fine if you have not subscribed to them.
+ +To use agents from an agent library, you just need to specify a `library_path` sub-field under `autobuild_build_config`; to use tools from a tool library, add an `autobuild_tool_config` field in CaptainAgent's configuration. + +```python +import autogen +from autogen.agentchat.contrib.captain_agent import CaptainAgent +from autogen import UserProxyAgent + +general_llm_config = { + "temperature": 0, + "config_list": autogen.config_list_from_json("OAI_CONFIG_LIST", filter_dict={"model": ["gpt-4-1106-preview"]}), +} + +nested_mode_config = { + "autobuild_init_config": { + "config_file_or_env": "OAI_CONFIG_LIST", + "builder_model": "gpt-4-1106-preview", + "agent_model": "gpt-4-1106-preview", + }, + # this is used to configure the autobuild building process + "autobuild_build_config": { + "default_llm_config": {"temperature": 1, "top_p": 0.95}, + "code_execution_config": {"timeout": 300, "work_dir": "groupchat", "last_n_messages": 1}, + "coding": True, + "library_path": "captainagent_expert_library.json" + }, + "autobuild_tool_config": { + "tool_root": "default", # this will use the tool library we provide + "retriever": "all-mpnet-base-v2", + }, + "group_chat_config": {"max_round": 10}, + "group_chat_llm_config": general_llm_config.copy(), + "max_turns": 3 +} + +## build agents +captain_agent = CaptainAgent( + name="captain_agent", + llm_config=general_llm_config, + nested_mode_config=nested_mode_config, +) +user_proxy = UserProxyAgent( + name="user_proxy", + code_execution_config={"use_docker": False}, +) +query = 'Find the stock price of Microsoft in the past 1 year and plot a line chart to show the trend. Save the line chart as "microsoft_stock_price.png".' +result = user_proxy.initiate_chat(captain_agent, message=query) +``` + +# Further Reading +For a detailed description of how to configure CaptainAgent, please refer to the [document](https://ag2ai.github.io/ag2/docs/topics/captainagent).
+ +Please refer to our [paper](https://arxiv.org/pdf/2405.19425) for more details about CaptainAgent and the proposed new team-building paradigm: adaptive build. + +If you find this blog useful, please consider citing: +``` +@misc{song2024adaptiveinconversationteambuilding, + title={Adaptive In-conversation Team Building for Language Model Agents}, + author={Linxin Song and Jiale Liu and Jieyu Zhang and Shaokun Zhang and Ao Luo and Shijian Wang and Qingyun Wu and Chi Wang}, + year={2024}, + eprint={2405.19425}, + archivePrefix={arXiv}, + primaryClass={cs.CL}, + url={https://arxiv.org/abs/2405.19425}, +} +``` diff --git a/website/blog/authors.yml b/website/blog/authors.yml index 3071886fb4..5029889ff3 100644 --- a/website/blog/authors.yml +++ b/website/blog/authors.yml @@ -18,9 +18,9 @@ yiranwu: jialeliu: name: Jiale Liu - title: Undergraduate student at Xidian University - url: https://leoljl.github.io - image_url: https://github.com/LeoLjl/leoljl.github.io/blob/main/profile.jpg?raw=true + title: PhD student at Pennsylvania State University + url: https://github.com/LeoLjl + image_url: https://github.com/leoljl.png thinkall: name: Li Jiang diff --git a/website/docs/FAQ.mdx b/website/docs/FAQ.mdx index e588725289..5d6152bcc8 100644 --- a/website/docs/FAQ.mdx +++ b/website/docs/FAQ.mdx @@ -34,8 +34,8 @@ In version >=1, OpenAI renamed their `api_base` parameter to `base_url`. So for Yes. You currently have two options: -- Autogen can work with any API endpoint which complies with OpenAI-compatible RESTful APIs - e.g. serving local LLM via FastChat or LM Studio. Please check https://ag2ai.github.io/autogen/blog/2023/07/14/Local-LLMs for an example. -- You can supply your own custom model implementation and use it with Autogen. Please check https://ag2ai.github.io/autogen/blog/2024/01/26/Custom-Models for more information. +- Autogen can work with any API endpoint which complies with OpenAI-compatible RESTful APIs - e.g. serving a local LLM via FastChat or LM Studio.
Please check https://ag2ai.github.io/ag2/blog/2023/07/14/Local-LLMs for an example. +- You can supply your own custom model implementation and use it with Autogen. Please check https://ag2ai.github.io/ag2/blog/2024/01/26/Custom-Models for more information. ## Handle Rate Limit Error and Timeout Error diff --git a/website/docs/autogen-studio/getting-started.md b/website/docs/autogen-studio/getting-started.md index 1ca954bfc6..9476ae3311 100644 --- a/website/docs/autogen-studio/getting-started.md +++ b/website/docs/autogen-studio/getting-started.md @@ -5,7 +5,7 @@ ![ARA](./img/ara_stockprices.png) -AutoGen Studio is an low-code interface built to help you rapidly prototype AI agents, enhance them with skills, compose them into workflows and interact with them to accomplish tasks. It is built on top of the [AutoGen](https://ag2ai.github.io/autogen) framework, which is a toolkit for building AI agents. +AutoGen Studio is a low-code interface built to help you rapidly prototype AI agents, enhance them with skills, compose them into workflows and interact with them to accomplish tasks. It is built on top of the [AutoGen](https://ag2ai.github.io/ag2) framework, which is a toolkit for building AI agents. Code for AutoGen Studio is on GitHub at [build-with-ag2](https://github.com/ag2ai/build-with-ag2/tree/main/samples/apps/autogen-studio) @@ -113,4 +113,4 @@ If you are building a production application, please use the AutoGen framework a ## Acknowledgements -AutoGen Studio is Based on the [AutoGen](https://ag2ai.github.io/autogen) project. It was adapted from a research prototype built in October 2023 (original credits: Gagan Bansal, Adam Fourney, Victor Dibia, Piali Choudhury, Saleema Amershi, Ahmed Awadallah, Chi Wang). +AutoGen Studio is based on the [AutoGen](https://ag2ai.github.io/ag2) project.
It was adapted from a research prototype built in October 2023 (original credits: Gagan Bansal, Adam Fourney, Victor Dibia, Piali Choudhury, Saleema Amershi, Ahmed Awadallah, Chi Wang). diff --git a/website/docs/topics/captainagent/_category_.json b/website/docs/topics/captainagent/_category_.json new file mode 100644 index 0000000000..60f390c97e --- /dev/null +++ b/website/docs/topics/captainagent/_category_.json @@ -0,0 +1,4 @@ +{ + "label": "Captain Agent", + "collapsible": true +} diff --git a/website/docs/topics/captainagent/agent_library.mdx b/website/docs/topics/captainagent/agent_library.mdx new file mode 100644 index 0000000000..c1d06a19e5 --- /dev/null +++ b/website/docs/topics/captainagent/agent_library.mdx @@ -0,0 +1,39 @@ +# Agent Library +A simple agent in the agent library requires three fields: +- description: This describes the functionality of the agent. +- system_message: This provides the system message of the agent for initialization. +- name: The name of the agent. + +An example of the agent library is as follows. +``` +[ + { + "description": "The Python_Programming_Expert specializes in using Python's pandas and numpy libraries to manipulate large data sets, particularly focusing on creating and analyzing a new 'STEM' feature from educational datasets, and works collaboratively in a multidisciplinary team.", + "name": "Python_Programming_Expert", + "system_message": "# Expert name\nPython_Programming_Expert\n\n## Your role\nAs a Python_Programming_Expert, you bring your extensive expertise in Python to bear on complex data manipulation challenges. Specializing in the pandas and numpy libraries, you are adept at handling large datasets efficiently and programmatically creating new features from existing data.
Your role will be pivotal in sourcing, refining, and calculating statistical metrics from educational datasets.\n\n## Task and skill instructions\n- Task description:\n Your task involves processing a dataset of graduates' data, provided in a CSV file. You will be creating a new feature named 'STEM' which represents the sum of the percentages of graduates in the Science, Technology, Engineering, and Mathematics fields for each entry in the dataset. Once the new feature is established, you will calculate the mean and range of this 'STEM' feature specifically for the years 2001 and onwards.\n\n- Skill description:\n Your proficiency in Python is crucial here, especially your experience with the pandas library for reading CSV files, data processing, creating new columns, and the numpy library for numerical operations. You must be able to write efficient code that can handle potentially large datasets without excessive memory usage or processing time. Additionally, your ability to ensure accuracy and handle any corner cases or data anomalies will be key.\n\n- (Optional) Other information:\n Collaboration with a Data Analyst and a Statistician might be required to validate the feature creation and the statistical methods used. Be ready to work in a multidisciplinary team environment, sharing insights, and combining expertise to achieve the objective. Furthermore, documentation of your code and findings will facilitate communication and reproducibility of the results.\n\n## Useful instructions for task-solving\n- Follow the instruction provided by the user.\n- Solve the task step by step if you need to.\n- If a plan is not provided, explain your plan first.\n- If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n- When you find an answer, verify the answer carefully.
\n- Include verifiable evidence in your response if possible.\n \n## How to use code?\n- Suggest python code (in a python coding block) or shell script (in a sh coding block) for the Computer_terminal to execute.\n- When using code, you must indicate the script type in the code block.\n- Do not suggest incomplete code which requires users to modify.\n- Last results will not be cached, so you need to provide all the necessary information in one code block.\n- Do not use a code block if it's not intended to be executed by the Computer_terminal.\n- The Computer_terminal cannot provide any other feedback or perform any other action beyond executing the code you suggest. \n- The Computer_terminal can't modify your code.\n- Use 'print' function for the output when relevant. \n- Check the execution result returned by the user.\n- Do not ask users to copy and paste the result.\n- If the result indicates there is an error, fix the error and output the code again. \n- If you want the Computer_terminal to save the code in a file before executing it, put # filename: inside the code block as the first line. " + } +] +``` + +We provide a predefined agent library in `notebook/captainagent_expert_library.json`. + +## Adding advanced agents +We also support adding agents with advanced capabilities to the library, beyond agents that differ only in their system message. Just add an `agent_path` field and any other arguments that need to be passed at initialization. For example, to add a WebSurferAgent: + +``` +[ + { + "name": "WebServing_Expert", + "description": "A helpful assistant with access to a web browser.
Ask them to perform web searches, open pages, navigate to Wikipedia, answer questions from pages, and/or generate summaries.", + "system_message": "", + "agent_path": "autogen/agentchat/contrib/web_surfer/WebSurferAgent", + "browser_config": { + "viewport_size": 5120, + "downloads_folder": "coding", + "request_kwargs": { + "headers": { + "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0" + } + } + } + } +] +``` diff --git a/website/docs/topics/captainagent/tool_library.mdx b/website/docs/topics/captainagent/tool_library.mdx new file mode 100644 index 0000000000..ad4edc0b07 --- /dev/null +++ b/website/docs/topics/captainagent/tool_library.mdx @@ -0,0 +1,59 @@ +# Tool Library +In CaptainAgent, tools are in the form of Python functions. The agents can write code to import functions and call them according to their needs. This can significantly enhance the functionality and capabilities of the agents. + +We provide a list of tools that come with the release of CaptainAgent. + +## Using the Built-in Tool Library +### Install dependencies +First, install the requirements for running the tools via pip. The requirements file is located in `autogen/agentchat/contrib/captainagent/tools/requirements.txt`. + +### Subscribe to Certain APIs +To use the provided built-in tools, you need to obtain a Bing Search API key and a RapidAPI key. +For the Bing API, you can read more about how to get an API key on the [Bing Web Search API](https://www.microsoft.com/en-us/bing/apis/bing-web-search-api) page. +For RapidAPI, you can [sign up](https://rapidapi.com/auth/sign-up) and subscribe to these two APIs ([link1](https://rapidapi.com/solid-api-solid-api-default/api/youtube-transcript3), [link2](https://rapidapi.com/420vijay47/api/youtube-mp3-downloader2)). +Both APIs have free billing options, so there is no need to worry about extra costs for small-scale runs.
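To make the above concrete: each tool in the library is an ordinary Python function whose docstring describes it for retrieval. Below is an illustrative sketch of a math tool — the name mirrors a file in the shipped library, but this implementation and signature are assumptions, not the actual source:

```python
import math

def calculate_circle_area_from_diameter(diameter: float) -> float:
    """Calculate the area of a circle given its diameter.

    In the library, a docstring like this is what describes the tool
    to the agents that retrieve and call it.
    """
    radius = diameter / 2
    return math.pi * radius**2

# An agent imports the tool by name in generated code and calls it:
print(round(calculate_circle_area_from_diameter(10.0), 2))  # 78.54
```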
+ +Whenever you run the tool-related code, remember to export the API keys as environment variables. +```bash +export BING_API_KEY="" +export RAPID_API_KEY="" +``` +or +```python +import os +os.environ["BING_API_KEY"] = "" +os.environ["RAPID_API_KEY"] = "" +``` + +Then you are good to go. Feel free to try out the examples provided in the CaptainAgent notebook. + +## What is in the Tool Library +The tool library consists of three types of tools: math, data_analysis, and information_retrieval. Its folder layout is as follows. Each `.py` file implements a tool. + +``` +tools +├── README.md +├── data_analysis +│ ├── calculate_correlation.py +│ └── ... +├── information_retrieval +│ ├── arxiv_download.py +│ ├── arxiv_search.py +│ └── ... +├── math +│ ├── calculate_circle_area_from_diameter.py +│ └── ... +└── tool_description.tsv +``` + +Tools can be imported from `tools/{category}/{tool_name}.py` with exactly the same tool name. +`tool_description.tsv` contains descriptions of tools for retrieval. + +## How Tool Library works +When an agent's description is provided, a retriever will retrieve `top_k` tool candidates from the library based on the semantic +similarity between the agent description and the tool descriptions. The tool description is the same as that in `tool_description.tsv`. + +After candidates are retrieved, the agent's system message will be updated with the tool candidates' information and how to call the tools. +A user proxy with the ability to execute the code will be added to the nested chat. Under the hood, this is achieved by leveraging the +[User Defined Functions](/docs/topics/code-execution/user-defined-functions) feature. A `LocalCommandLineCodeExecutor` equipped with all the functions serves as
diff --git a/website/docs/topics/non-openai-models/about-using-nonopenai-models.md b/website/docs/topics/non-openai-models/about-using-nonopenai-models.md index 9ca768d5e7..41134a5224 100644 --- a/website/docs/topics/non-openai-models/about-using-nonopenai-models.md +++ b/website/docs/topics/non-openai-models/about-using-nonopenai-models.md @@ -1,7 +1,7 @@ # Non-OpenAI Models AutoGen allows you to use non-OpenAI models through proxy servers that provide -an OpenAI-compatible API or a [custom model client](https://ag2ai.github.io/autogen/blog/2024/01/26/Custom-Models) +an OpenAI-compatible API or a [custom model client](https://ag2ai.github.io/ag2/blog/2024/01/26/Custom-Models) class. Benefits of this flexibility include access to hundreds of models, assigning specialized diff --git a/website/docusaurus.config.js b/website/docusaurus.config.js index 0454996323..40fcf1f665 100644 --- a/website/docusaurus.config.js +++ b/website/docusaurus.config.js @@ -242,6 +242,7 @@ module.exports = { routeBasePath: 'talks', path: './talks', showReadingTime: true, + postsPerPage: 'ALL', }, ], [ diff --git a/website/process_notebooks.py b/website/process_notebooks.py index 29af4711e1..05f199b37c 100755 --- a/website/process_notebooks.py +++ b/website/process_notebooks.py @@ -77,8 +77,6 @@ def check_quarto_bin(quarto_bin: str = "quarto") -> None: def notebooks_target_dir(website_directory: Path) -> Path: """Return the target directory for notebooks.""" - print("result-----------") - print(website_directory / "docs" / "notebooks") return website_directory / "docs" / "notebooks" @@ -459,8 +457,6 @@ def main() -> None: test_parser.add_argument("--workers", help="Number of workers to use", type=int, default=-1) args = parser.parse_args() - print("------------------------") - print(args.website_directory) if args.subcommand is None: print("No subcommand specified") sys.exit(1) diff --git a/website/talks/2024-11-11/index.mdx b/website/talks/2024-11-11/index.mdx new file mode 100644 index 
0000000000..7e99128919 --- /dev/null +++ b/website/talks/2024-11-11/index.mdx @@ -0,0 +1,13 @@ +--- +title: Multi-AI Agents for Chip Design with Distilled Knowledge Debugging Graph, Task Graph Solving, and Multi-Modal Capabilities - Nov 11, 2024 +--- + +### Speakers: Chia-Tung Ho + +### Biography of the speakers: + +Chia-Tung Ho is a senior research scientist at Nvidia Research. He received his Ph.D. in electrical and computer engineering from the University of California, San Diego, USA, in 2022. Chia-Tung has several years of experience in the EDA industry. Before moving to the US, he worked for IDM and EDA companies in Taiwan, developing in-house design-for-manufacturing (DFM) flows at Macronix, as well as fastSPICE solutions at Mentor Graphics and Synopsys. During his Ph.D., he collaborated with the Design Technology Co-Optimization (DTCO) team at Synopsys and served as an AI resident at X, the Moonshot Factory (formerly Google X). His recent work focuses on developing LLM agents for chip design and integrating advanced knowledge extraction, task graph solving, and reinforcement learning techniques for debugging and design optimization. + +### Abstract: + +Hardware design presents numerous challenges due to its complexity and rapidly advancing technologies. The stringent requirements for performance, power, area, and cost (PPAC) in modern complex designs, which can include up to billions of transistors, make hardware design increasingly demanding compared to earlier generations. These challenges result in longer turnaround times (TAT) for optimizing PPAC during RTL synthesis, simulation, verification, physical design, and reliability processes. In this talk, we introduce multi-AI agents built on top of Autogen to improve efficiency and reduce TAT in the chip design process.
The talk explores the integration of novel distilled knowledge debugging graphs, task graph solving, and multimodal capabilities within multi-AI agents to address tasks such as timing debugging, Verilog debugging, and Design Rule Check (DRC) code generation. Based on these studies, multi-AI agents demonstrate promising improvements in performance, productivity, and efficiency in chip design. diff --git a/website/talks/2024-11-12/index.mdx b/website/talks/2024-11-12/index.mdx new file mode 100644 index 0000000000..0c6054eb59 --- /dev/null +++ b/website/talks/2024-11-12/index.mdx @@ -0,0 +1,13 @@ +--- +title: Introducing FastAgency - the fastest way to bring AutoGen workflows to production - Nov 12, 2024 +--- + +### Speakers: Davor Runje + +### Biography of the speakers: + +Davor Runje is a seasoned software engineer, computer scientist, and serial entrepreneur with a strong background in technology and business. Most recently, he co-founded an AI startup, Airt. Prior to that, he co-founded and exited two companies. Davor is a very active member of the open-source community. He is a maintainer of FastStream and FastAgency and a core contributor to AutoGen. During his PhD studies, under the mentorship of Dean Rosenzweig from the University of Zagreb and Yuri Gurevich at Microsoft Research, Davor made significant contributions to programming for multiprocessor/multicore systems. He led the design, implementation, and technology transfer of a system that facilitated structured concurrency program execution, which became known as the Task Parallel Library in the .NET framework. This earned him the SSCLI and Phoenix 2005 award from Microsoft Research, recognising it as one of the top 16 international research projects. Davor is also an esteemed academic author with over 20 publications in theoretical computer science and artificial intelligence. He also holds two US patents.
Between 2020 and 2024, he served as the president of the board of CISEx, the largest software industry association in Croatia, advocating for legal and tax reforms to enhance the global competitiveness of the Croatian IT sector. + +### Abstract: + +Inspired by the design and philosophy of modern Python frameworks such as FastAPI, we designed a framework that allows you to go from a working multi-agent prototype written in AutoGen to a scalable, multi-tenant application with SSO authentication hosted on the cloud in less than one hour. Depending on your needs, you can quickly build a REST-based web service running multiple workers or, at even larger scale, a distributed service built around a message broker protocol. The framework is powerful and uses complex technologies under the hood, yet it is simple to use and requires only a few lines of code to get the desired results. FastAgency also has a simple-to-use component that allows you to render rich information in the UI, giving you a better way to communicate with the end user. Last but not least, FastAgency allows you to import external REST APIs using their OpenAPI specifications and automatically build tools that can be attached to agents in just a few lines of code. In this talk, I'll walk you through the core concepts behind the framework and illustrate them with examples. diff --git a/website/talks/2024-11-18/index.mdx b/website/talks/2024-11-18/index.mdx new file mode 100644 index 0000000000..7e19a9ba00 --- /dev/null +++ b/website/talks/2024-11-18/index.mdx @@ -0,0 +1,13 @@ +--- +title: Integrating Foundation Models and Symbolic Computing for Next-Generation Robot Planning - Nov 18, 2024 +--- + +### Speakers: Yongchao Chen + +### Biography of the speakers: + +Yongchao Chen is a PhD student in Electrical Engineering at Harvard SEAS and MIT LIDS. He is currently working on Robot Planning with Foundation Models under the guidance of Prof. Chuchu Fan and Prof.
Nicholas Roy at MIT and co-advised by Prof. Na Li at Harvard. He is also doing research in AI for Physics and Materials, and is particularly interested in applying Robotics/Foundation Models to AI4Science. Yongchao interned at Microsoft Research in the summer of 2024 and has been working with the MIT-IBM Watson AI Lab since spring 2023. + +### Abstract: + +State-of-the-art language models, like GPT-4o and O1, continue to face challenges in solving tasks with intricate constraints involving logic, geometry, iteration, and optimization. While it's common to query LLMs to generate a plan purely through text output, we stress the importance of integrating symbolic computing to enhance general planning capabilities. By combining LLMs with symbolic planners and solvers, or guiding LLMs to generate code for planning, we enable them to address complex decision-making tasks for both real and virtual robots. This approach extends to various applications, including task and motion planning for drones and manipulators, travel itinerary planning, website agent design, and more. diff --git a/website/talks/future_talks/index.mdx b/website/talks/future_talks/index.mdx index 580883153d..70c56c66cb 100644 --- a/website/talks/future_talks/index.mdx +++ b/website/talks/future_talks/index.mdx @@ -2,19 +2,49 @@ title: Upcoming Talks --- -## Integrating Foundation Models and Symbolic Computing for Next-Generation Robot Planning - Nov 18, 2024 +## Mosaia - The AI community’s platform for creating, sharing and deploying AI agents in a serverless cloud environment - Nov 28, 2024 -### Speakers: Yongchao Chen +### Speakers: Aaron Wong-Ellis ### Biography of the speakers: -Yongchao Chen is a PhD student of Electrical Engineering at Harvard SEAS and MIT LIDS. He is currently working on Robot Planning with Foundation Models under the guidance of Prof. Chuchu Fan and Prof. Nicholas Roy at MIT and co-advised by Prof. Na Li at Harvard.
He is also doing the research in AI for Physics and Materials, particularly interested in applying Robotics/Foundation Models into AI4Science. Yongchao interned at Microsoft Research in 2024 summer and has been working with MIT-IBM Watson AI Lab starting from 2023 Spring. +Aaron Wong-Ellis is the co-founder and CTO at Mosaia. His several years of experience in AI, IoT, and enterprise platforms have equipped him with the right skill set to build Mosaia. Aaron has worked as an application architect and engineer for small startups and large Fortune 100 companies like AWS. His recent work focuses on developing a platform for creating, sharing and running LLM agents in a scalable serverless cloud infrastructure. ### Abstract: -State-of-the-art language models, like GPT-4o and O1, continue to face challenges in solving tasks with intricate constraints involving logic, geometry, iteration, and optimization. While it's common to query LLMs to generate a plan purely through text output, we stress the importance of integrating symbolic computing to enhance general planning capabilities. By combining LLMs with symbolic planners and solvers, or guiding LLMs to generate code for planning, we enable them to address complex decision-making tasks for both real and virtual robots. This approach extends to various applications, including task and motion planning for drones and manipulators, travel itinerary planning, website agent design, and more. +Running multiple AI agents reliably in the cloud presents numerous challenges. At Mosaia we faced these challenges head-on and created a way to do this in a scalable serverless cloud environment, allowing people to run their agents with little to no code at all. Just write up your prompts and construct your groups of agents through a browser-based UI. Being able to do this opened up many possibilities to construct and share agents with others to use on Mosaia or run locally using AutoGen.
Mosaia was created as a platform not only to run agents but also to let prompt engineers host and share these agents with others, fostering a community of collaboration and creativity around building AI agents. + +### Sign Up: https://discord.gg/NrNP5ZAx?event=1308232124062503012 + + +## Make AI Agents Collaborate: Drag, Drop, and Orchestrate with Waldiez - Dec 9, 2024 + +### Speakers: Panagiotis Kasnesis + +### Biography of the speakers: + +Panagiotis Kasnesis holds a Ph.D. degree in computer science from the Department of Electrical and Computer Engineering at NTUA. He received his diploma degree in chemical engineering and his M.Sc. in techno-economic systems from NTUA in 2008 and 2013, respectively. His research interests include Machine/Deep learning, Multi-Agent Systems, and IoT, and he has published more than 50 scientific articles in international journals/conferences in these fields. He is the founder and CEO of Waldiez (https://waldiez.io/), co-founder and CTO of ThinGenious, and serves as a senior researcher at the University of West Attica. Moreover, he is a lecturer at the MSc program “Artificial Intelligence and Deep Learning” (https://aidl.uniwa.gr/) and is certified as a University Ambassador by the NVIDIA Deep Learning Institute (DLI) in Building Transformer-Based NLP Applications and Rapid Application Development Using LLMs. + +### Abstract: + +Current LLM-based orchestration tools often lack support for multi-agent interactions, are restricted to basic communication patterns, or only provide information after the entire workflow has completed. Waldiez is an open-source workflow tool that lets you orchestrate your LLM agents using drag-and-drop and develop complex agentic applications. It is a low-code tool that helps you design and visualize your multi-agent workflow as a JupyterLab plugin.
Waldiez runs on top of AG2, supporting all the communication patterns (e.g., sequential, nested, and group chat) and several LLM-based services offered by OpenAI, Anthropic, NVIDIA NIM, locally hosted models, and several others. In this talk, we’ll dive into the powerful features of Waldiez, demonstrating its capabilities through real-world use cases. Join us as we explore how Waldiez can streamline complex workflows and enhance multi-agent interactions, showcasing exactly what sets it apart from other LLM-based orchestration tools. + +### Sign Up: https://discord.gg/NrNP5ZAx?event=1308233315442098197 + +## Transforming CRM with Agents: The Journey to Ully.ai's Next-Gen ERP - Dec 16, 2024 + +### Speakers: Bassil Khilo + +### Biography of the speakers: + +Bassil Khilo is the Founder & CEO of Ully.ai, an AI-powered CRM platform that automates lead research and outreach while offering full-cycle sales management tools. With a vision to redefine enterprise resource planning (ERP) systems, Bassil combines his experience in SaaS sales and entrepreneurship to build solutions that empower businesses to scale efficiently. Before founding Ully.ai, Bassil served as a Senior Account Executive at a globally renowned ERP company, where he gained hands-on experience in ERP and CRM solutions. His entrepreneurial journey includes founding ventures such as Maple Tyres, an e-commerce platform for tires in the UAE, and WarrenAI, a stock research platform that helps investors identify top-performing companies. + +### Abstract: + +In today's fast-paced digital landscape, businesses need tools that go beyond conventional CRM systems to manage their operations efficiently. At Ully.ai, we’ve built a powerful AI-driven CRM (soon evolving into a full ERP) that redefines how businesses research leads, engage with customers, and manage their sales cycle. Ully automates WhatsApp and email replies with AI agents, enriches leads with deep insights, and personalizes outreach at scale.
This talk explores how AutoGen’s agents can streamline operations, improve customer engagement, and drive growth with RAG. + +### Sign Up: https://discord.gg/NrNP5ZAx?event=1308497335768059974 -### Sign Up: https://discord.gg/Swn3DmBV?event=1303162642298306681 ## How to follow up with the latest talks?