ag2ai · qingyun-wu · Nov 19, 2024 · Nov 17, 2024 · Nov 17, 2024 · Nov 17, 2024
diff --git a/notebook/captainagent_expert_library.json b/notebook/captainagent_expert_library.json
diff --git a/website/blog/2024-11-15-CaptainAgent/img/build.png b/website/blog/2024-11-15-CaptainAgent/img/build.png
diff --git a/website/blog/2024-11-15-CaptainAgent/img/chat.png b/website/blog/2024-11-15-CaptainAgent/img/chat.png
diff --git a/website/blog/2024-11-15-CaptainAgent/img/overall.png b/website/blog/2024-11-15-CaptainAgent/img/overall.png
diff --git a/website/blog/2024-11-15-CaptainAgent/index.mdx b/website/blog/2024-11-15-CaptainAgent/index.mdx
@@ -0,0 +1,137 @@
+---
+title: "Introducing CaptainAgent for Adaptive Team Building"
+authors:
+  - LinxinS97
+  - jialeliu
+  - jieyuz2
+tags: [LLM, GPT, AutoBuild]
+---
+![Illustration of how CaptainAgent builds a team](img/overall.png)
+
+**TL;DR**
+- We introduce CaptainAgent, an agent equipped with the capability to adaptively assemble a team of agents through a retrieval-selection-generation process to handle complex tasks via the [`nested chat`](https://ag2ai.github.io/ag2/docs/tutorial/conversation-patterns#nested-chats) conversation pattern in AG2.
+- CaptainAgent supports all types of `ConversableAgents` implemented in AG2.
+
+# Introduction
+
+Given an ad-hoc task, dynamically assembling a group of agents capable of effectively solving the problem is a complex challenge. In many cases, we manually design and select the agents involved. In this blog, we introduce **CaptainAgent**, an intelligent agent that can autonomously assemble a team of agents tailored to meet diverse and complex task requirements.
+CaptainAgent iterates over the following two steps until the problem is successfully solved.
+- (**Step 1**) CaptainAgent will break down the task, recommend several roles needed for each subtask, and then create a team of agents accordingly. Each agent in the team is either generated from scratch or retrieved and selected from an agent library if provided. Each of them will also be equipped with predefined tools retrieved from a tool library if provided.
+![Building workflow](img/build.png)
+- (**Step 2**) For each subtask, the corresponding team of agents will jointly solve it. Once it's done, a summarization and reflection step will be triggered to generate a report based on the multi-agent conversation history. Based on the report, CaptainAgent will decide whether to adjust the subtasks and corresponding team (go to Step 1) or to terminate and output the results.
+![Chat workflow](img/chat.png)
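
The two steps above can be read as a build, solve, reflect loop. The following is only a minimal sketch of that control flow under stated assumptions: `run_captain_loop`, `Report`, `max_iterations`, and the three callables are hypothetical names invented for illustration and are not part of the AG2 API.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical summary produced by the reflection step."""
    solved: bool
    summary: str


def run_captain_loop(task, build_team, solve, reflect, max_iterations=3):
    """Repeat Step 1 (build) and Step 2 (solve, then reflect) until solved."""
    report = Report(solved=False, summary="no iterations run")
    for _ in range(max_iterations):
        team = build_team(task)       # Step 1: assemble a team for the task
        history = solve(team, task)   # Step 2: the team works on the task
        report = reflect(history)     # Step 2: summarize and reflect
        if report.solved:             # terminate, otherwise return to Step 1
            break
    return report


# Stand-in callables so the sketch runs without any LLM.
attempts = []


def demo_build_team(task):
    return ["Planner", "Python_Expert"]


def demo_solve(team, task):
    return [f"{member}: worked on {task}" for member in team]


def demo_reflect(history):
    attempts.append(history)
    # Pretend the second attempt succeeds.
    return Report(solved=len(attempts) >= 2, summary=f"{len(history)} messages reviewed")


final_report = run_captain_loop("game of 24", demo_build_team, demo_solve, demo_reflect)
print(final_report)
```

In the real agent, each of the three callables would be backed by LLM calls and a nested group chat; the sketch only shows where the decision to rebuild or terminate sits.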
+
+The design of CaptainAgent allows it to leverage agents and tools from a pre-specified agent library and tool library. In the following section, we demonstrate how to use CaptainAgent with or without the provided library.
+
+# Using CaptainAgent without pre-specified agent/tool libraries
+CaptainAgent can serve as a drop-in replacement for the general `AssistantAgent` class in AG2. To do that, we just need to add a few lines of configuration for the group chat involved.
+When an agent library is not provided, CaptainAgent will automatically generate a set of agents from scratch to form a group chat.
+
+```python
+import autogen  # needed for autogen.config_list_from_json below
+from autogen.agentchat.contrib.captain_agent import CaptainAgent
+from autogen import UserProxyAgent
+
+general_llm_config = {
+    "temperature": 0,
+    "config_list": autogen.config_list_from_json("OAI_CONFIG_LIST", filter_dict={"model": ["gpt-4-1106-preview"]}),
+}
+
+nested_mode_config = {
+    "autobuild_init_config": {
+        "config_file_or_env": "OAI_CONFIG_LIST",
+        "builder_model": "gpt-4-1106-preview",
+        "agent_model": "gpt-4-1106-preview",
+    },
+    # this is used to configure the autobuild building process
+    "autobuild_build_config": {
+        "default_llm_config": {"temperature": 1, "top_p": 0.95},
+        "code_execution_config": {"timeout": 300, "work_dir": "groupchat", "last_n_messages": 1},
+        "coding": True,
+    },
+    "group_chat_config": {"max_round": 15},
+    "group_chat_llm_config": general_llm_config.copy(),
+    "max_turns": 3,
+}
+
+## build agents
+captain_agent = CaptainAgent(
+    name="captain_agent",
+    llm_config=general_llm_config,
+    nested_mode_config=nested_mode_config,
+)
+user_proxy = UserProxyAgent(
+    name="user_proxy",
+    code_execution_config={"use_docker": False},
+)
+query = "Let's play game of 24. Given 4 numbers, you need to use +, -, *, / to get 24. The numbers are 2, 2, 7, 12."
+result = user_proxy.initiate_chat(captain_agent, message=query)
+```
+
+# Using CaptainAgent with pre-specified agent/tool libraries
+To use CaptainAgent with pre-specified agent/tool libraries, we just need to specify the path to the agent library and tool library. The two libraries are independent, so you can choose to use one of the libraries or both.
+The tool library we provide requires subscribing to specific APIs; please refer to the [docs](https://ag2ai.github.io/ag2/docs/topics/captainagent) for details. The following example does not necessarily require tool usage, so it's fine if you are not subscribed to them.
+
+To use agents from an agent library, specify the `library_path` sub-field under `autobuild_build_config`. To use tools from a tool library, add an `autobuild_tool_config` field to CaptainAgent's configuration.
+
+```python
+import autogen  # needed for autogen.config_list_from_json below
+from autogen.agentchat.contrib.captain_agent import CaptainAgent
+from autogen import UserProxyAgent
+
+general_llm_config = {
+    "temperature": 0,
+    "config_list": autogen.config_list_from_json("OAI_CONFIG_LIST", filter_dict={"model": ["gpt-4-1106-preview"]}),
+}
+
+nested_mode_config = {
+    "autobuild_init_config": {
+        "config_file_or_env": "OAI_CONFIG_LIST",
+        "builder_model": "gpt-4-1106-preview",
+        "agent_model": "gpt-4-1106-preview",
+    },
+    # this is used to configure the autobuild building process
+    "autobuild_build_config": {
+        "default_llm_config": {"temperature": 1, "top_p": 0.95},
+        "code_execution_config": {"timeout": 300, "work_dir": "groupchat", "last_n_messages": 1},
+        "coding": True,
+        "library_path": "captainagent_expert_library.json"
+    },
+    "autobuild_tool_config": {
+        "tool_root": "default",  # this will use the tool library we provide
+        "retriever": "all-mpnet-base-v2",
+    },
+    "group_chat_config": {"max_round": 10},
+    "group_chat_llm_config": general_llm_config.copy(),
+    "max_turns": 3
+}
+
+## build agents
+captain_agent = CaptainAgent(
+    name="captain_agent",
+    llm_config=general_llm_config,
+    nested_mode_config=nested_mode_config,
+)
+user_proxy = UserProxyAgent(
+    name="user_proxy",
+    code_execution_config={"use_docker": False},
+)
+query = 'Find the stock price of Microsoft in the past 1 year and plot a line chart to show the trend. Save the line chart as "microsoft_stock_price.png".'
+result = user_proxy.initiate_chat(captain_agent, message=query)
+```
+
+# Further Reading
+For a detailed description of how to configure CaptainAgent, please refer to the [documentation](https://ag2ai.github.io/ag2/docs/topics/captainagent).
+
+Please refer to our [paper](https://arxiv.org/pdf/2405.19425) for more details about CaptainAgent and the proposed new team-building paradigm: adaptive build.
+
+If you find this blog useful, please consider citing:
+```
+@misc{song2024adaptiveinconversationteambuilding,
+      title={Adaptive In-conversation Team Building for Language Model Agents},
+      author={Linxin Song and Jiale Liu and Jieyu Zhang and Shaokun Zhang and Ao Luo and Shijian Wang and Qingyun Wu and Chi Wang},
+      year={2024},
+      eprint={2405.19425},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2405.19425},
+}
+```
diff --git a/website/blog/authors.yml b/website/blog/authors.yml
@@ -18,9 +18,9 @@ yiranwu:
 
 jialeliu:
   name: Jiale Liu
-  title: Undergraduate student at Xidian University
-  url: https://leoljl.github.io
-  image_url: https://github.com/LeoLjl/leoljl.github.io/blob/main/profile.jpg?raw=true
+  title: PhD student at Pennsylvania State University
+  url: https://github.com/LeoLjl
+  image_url: https://github.com/leoljl.png
 
 thinkall:
   name: Li Jiang

diff --git a/website/docs/topics/captainagent/_category_.json b/website/docs/topics/captainagent/_category_.json
@@ -0,0 +1,4 @@
+{
+    "label": "Captain Agent",
+    "collapsible": true
+}
diff --git a/website/docs/topics/captainagent/agent_library.mdx b/website/docs/topics/captainagent/agent_library.mdx
@@ -0,0 +1,39 @@
+# Agent Library
+A simple agent in the agent library requires three fields:
+- description: This describes the functionality of the agent.
+- system_message: This provides the system message of the agent for initialization.
+- name: The name of the agent.
+
+An example of the agent library is as follows.
+```
+[
+    {
+        "description": "The Python_Programming_Expert specializes in using Python's pandas and numpy libraries to manipulate large data sets, particularly focusing on creating and analyzing a new 'STEM' feature from educational datasets, and works collaboratively in a multidisciplinary team.",
+        "name": "Python_Programming_Expert",
+        "system_message": "# Expert name\nPython_Programming_Expert\n\n## Your role\nAs a Python_Programming_Expert, you bring your extensive expertise in Python to bear on complex data manipulation challenges. Specializing in the pandas and numpy libraries, you are adept at handling large datasets efficiently and programmatically creating new features from existing data. Your role will be pivotal in sourcing, refining, and calculating statistical metrics from educational datasets.\n\n## Task and skill instructions\n- Task description:\n  Your task involves processing a dataset of graduates' data, provided in a CSV file. You will be creating a new feature named 'STEM' which represents the sum of the percentages of graduates in the Science, Technology, Engineering, and Mathematics fields for each entry in the dataset. Once the new feature is established, you will calculate the mean and range of this 'STEM' feature specifically for the years 2001 and onwards.\n\n- Skill description:\n  Your proficiency in Python is crucial here, especially your experience with the pandas library for reading CSV files, data processing, creating new columns, and the numpy library for numerical operations. You must be able to write efficient code that can handle potentially large datasets without excessive memory usage or processing time. Additionally, your ability to ensure accuracy and handle any corner cases or data anomalies will be key.\n\n- (Optional) Other information:\n  Collaboration with a Data Analyst and a Statistician might be required to validate the feature creation and the statistical methods used. Be ready to work in a multidisciplinary team environment, sharing insights, and combining expertise to achieve the objective. Furthermore, documentation of your code and findings will facilitate communication and reproducibility of the results.\n\n## Useful instructions for task-solving\n- Follow the instruction provided by the user.\n- Solve the task step by step if you need to.\n- If a plan is not provided, explain your plan first.\n- If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n- When you find an answer, verify the answer carefully. \n- Include verifiable evidence in your response if possible.\n    \n## How to use code?\n- Suggest python code (in a python coding block) or shell script (in a sh coding block) for the Computer_terminal to execute.\n- When using code, you must indicate the script type in the code block.\n- Do not suggest incomplete code which requires users to modify.\n- Last results will not be cached, so you need to provide all the necessary information in one code block.\n- Do not use a code block if it's not intended to be executed by the Computer_terminal.\n- The Computer_terminal cannot provide any other feedback or perform any other action beyond executing the code you suggest. \n- The Computer_terminal can't modify your code.\n- Use 'print' function for the output when relevant. \n- Check the execution result returned by the user.\n- Do not ask users to copy and paste the result.\n- If the result indicates there is an error, fix the error and output the code again. \n- If you want the Computer_terminal to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. "
+    }
+]
+```
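
Before pointing `library_path` at a custom library, it can help to check that every entry carries the three required fields listed above. The following is a minimal sketch using only the standard library; `find_invalid_entries` and the sample entries are hypothetical and not part of AG2.

```python
import json

# The three fields the docs above list as required for a simple agent.
REQUIRED_FIELDS = ("name", "description", "system_message")


def find_invalid_entries(library_json: str) -> list:
    """Return (index, missing_fields) for every entry lacking a required field."""
    entries = json.loads(library_json)
    problems = []
    for index, entry in enumerate(entries):
        missing = [field for field in REQUIRED_FIELDS if field not in entry]
        if missing:
            problems.append((index, missing))
    return problems


# Made-up library content: the second entry is deliberately incomplete.
sample_library = json.dumps([
    {
        "name": "Python_Programming_Expert",
        "description": "Handles pandas and numpy tasks.",
        "system_message": "# Expert name\nPython_Programming_Expert",
    },
    {"name": "Incomplete_Expert", "description": "Has no system message."},
])

print(find_invalid_entries(sample_library))  # [(1, ['system_message'])]
```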
+
+We provide a predefined agent library in `notebook/captainagent_expert_library.json`.
+
+## Adding advanced agents
+Besides agents that differ only in their system message, we also support adding agents with advanced capabilities to the library. You just need to add an `agent_path` field and any other arguments that need to be passed at initialization. For example, to add a WebSurferAgent:
+
+```
+[
+    {
+        "name": "WebServing_Expert",
+        "description": "A helpful assistant with access to a web browser. Ask them to perform web searches, open pages, navigate to Wikipedia, answer questions from pages, and or generate summaries.",
+        "system_message": "",
+        "agent_path": "autogen/agentchat/contrib/web_surfer/WebSurferAgent",
+        "browser_config": {
+            "viewport_size": 5120,
+            "downloads_folder": "coding",
+            "request_kwargs": {
+                "headers": {
+                    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0"
+                }
+            }
+        }
+    }
+]
+```
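
One plausible way to read an `agent_path` such as `autogen/agentchat/contrib/web_surfer/WebSurferAgent` is "module path, then class name". The sketch below illustrates that reading with the standard library only; it is an assumption about the format, not a description of how CaptainAgent resolves the path internally. `load_agent_class` and `split_init_kwargs` are hypothetical helpers, and the demo uses `collections/OrderedDict` so that it runs without AG2 installed.

```python
import importlib


def load_agent_class(agent_path: str):
    """Split 'pkg/module/ClassName' into a module and a class, then import it."""
    module_path, _, class_name = agent_path.rpartition("/")
    module = importlib.import_module(module_path.replace("/", "."))
    return getattr(module, class_name)


def split_init_kwargs(entry: dict) -> dict:
    """Collect the extra fields (e.g. browser_config) meant for initialization."""
    standard_fields = {"name", "description", "system_message", "agent_path"}
    return {key: value for key, value in entry.items() if key not in standard_fields}


# Stand-in entry using a standard-library class instead of WebSurferAgent.
entry = {
    "name": "Demo_Expert",
    "description": "Placeholder entry for illustration.",
    "system_message": "",
    "agent_path": "collections/OrderedDict",
    "browser_config": {"viewport_size": 5120},
}

ordered_dict_cls = load_agent_class(entry["agent_path"])
extra_kwargs = split_init_kwargs(entry)
print(ordered_dict_cls.__name__, extra_kwargs)
```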
diff --git a/website/docs/topics/captainagent/tool_library.mdx b/website/docs/topics/captainagent/tool_library.mdx
@@ -0,0 +1,59 @@
+# Tool Library
+In CaptainAgent, tools are in the form of Python functions. The agents can write code to import functions and call them according to their needs. This can significantly enhance the functionality and capabilities of the agents.
+
+We provide a list of tools that come with the release of CaptainAgent.
+
+## Using the Built-in Tool Library
+### Install dependencies
+First install the requirements for running tools via pip. The requirements file is located in `autogen/agentchat/contrib/captainagent/tools/requirements.txt`.
+
+### Subscribe to Certain APIs
+To use the provided built-in tools, you need a Bing Search API key and a RapidAPI key.
+For the Bing API, you can read more about how to get an API key on the [Bing Web Search API](https://www.microsoft.com/en-us/bing/apis/bing-web-search-api) page.
+For RapidAPI, you can [sign up](https://rapidapi.com/auth/sign-up) and subscribe to these two APIs ([link1](https://rapidapi.com/solid-api-solid-api-default/api/youtube-transcript3), [link2](https://rapidapi.com/420vijay47/api/youtube-mp3-downloader2)).
+Both APIs have free billing options, so there is no need to worry about extra costs for small-scale runs.
+
+Whenever you run tool-related code, remember to export the API keys as environment variables.
+```bash
+export BING_API_KEY=""
+export RAPID_API_KEY=""
+```
+or
+```python
+import os
+os.environ["BING_API_KEY"] = ""
+os.environ["RAPID_API_KEY"] = ""
+```
+
+Then you are good to go. Feel free to try out the examples provided in the CaptainAgent notebook.
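
A quick check at the top of a script can catch a missing key before any tool is called. This is a minimal sketch; `missing_api_keys` is a hypothetical helper, and only the two variable names come from the instructions above.

```python
import os

# Environment variable names taken from the export commands above.
REQUIRED_KEYS = ("BING_API_KEY", "RAPID_API_KEY")


def missing_api_keys(environ=None) -> list:
    """Return the required keys that are unset or empty in the environment."""
    environ = os.environ if environ is None else environ
    return [key for key in REQUIRED_KEYS if not environ.get(key)]


# Demo with an explicit mapping so the result does not depend on the machine.
demo_missing = missing_api_keys({"BING_API_KEY": "placeholder-value"})
print(demo_missing)  # ['RAPID_API_KEY']
```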
+
+## What is in the Tool Library
+The tool library consists of three types of tools: math, data_analysis and information_retrieval. Its folder layout is as follows. Each `.py` file implements a tool.
+
+```
+tools
+├── README.md
+├── data_analysis
+│   ├── calculate_correlation.py
+│   └── ...
+├── information_retrieval
+│   ├── arxiv_download.py
+│   ├── arxiv_search.py
+│   └── ...
+├── math
+│   ├── calculate_circle_area_from_diameter.py
+│   └── ...
+└── tool_description.tsv
+```
+
+Tools can be imported from `tools/{category}/{tool_name}.py` with exactly the same tool name.
+`tool_description.tsv` contains descriptions of tools for retrieval.
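
To make the naming convention concrete, the sketch below loads a function from `tools/{category}/{tool_name}.py`, reading "exactly the same tool name" as "the function is named after its file". It builds a throwaway `tools` directory so it runs on its own; `load_tool` is a hypothetical helper, and the body of the demo tool is a stand-in rather than the shipped implementation.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_tool(tool_root: Path, category: str, tool_name: str):
    """Import tools/{category}/{tool_name}.py and return the function named tool_name."""
    tool_file = tool_root / category / f"{tool_name}.py"
    spec = importlib.util.spec_from_file_location(tool_name, tool_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, tool_name)


with tempfile.TemporaryDirectory() as tmp:
    # Recreate a tiny slice of the folder layout shown above.
    root = Path(tmp) / "tools"
    (root / "math").mkdir(parents=True)
    (root / "math" / "calculate_circle_area_from_diameter.py").write_text(
        "import math\n\n"
        "def calculate_circle_area_from_diameter(diameter):\n"
        "    return math.pi * (diameter / 2) ** 2\n"
    )
    area_fn = load_tool(root, "math", "calculate_circle_area_from_diameter")
    area = area_fn(2)

print(area)
```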
+
+## How the Tool Library works
+When an agent's description is provided, a retriever will retrieve `top_k` tool candidates from the library based on the semantic
+similarity between the agent description and the tool descriptions. The tool descriptions are the same as those in `tool_description.tsv`.
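
The retrieval step can be sketched as ranking tool descriptions by similarity to the agent description and keeping the best `top_k`. The example below swaps the sentence-embedding retriever for a plain bag-of-words cosine similarity so that it runs with the standard library only; that substitution, the `retrieve_top_k` helper, and the tool descriptions are all assumptions made for illustration (only the tool names come from the folder layout above).

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Very rough stand-in for an embedding: lowercase token counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(agent_description: str, tool_descriptions: dict, top_k: int = 1) -> list:
    """Rank tools by similarity to the agent description and keep the best top_k."""
    query = bag_of_words(agent_description)
    ranked = sorted(
        tool_descriptions,
        key=lambda name: cosine_similarity(query, bag_of_words(tool_descriptions[name])),
        reverse=True,
    )
    return ranked[:top_k]


# Made-up descriptions standing in for rows of tool_description.tsv.
tool_descriptions = {
    "calculate_correlation": "calculate the correlation between two columns of a csv file",
    "arxiv_search": "search arxiv for papers matching a query",
    "calculate_circle_area_from_diameter": "calculate the area of a circle from its diameter",
}

best_tools = retrieve_top_k("expert who can search arxiv papers", tool_descriptions, top_k=1)
print(best_tools)  # ['arxiv_search']
```

A real embedding model captures meaning beyond shared words, so its ranking can differ from this token-overlap approximation.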
+
+After candidates are retrieved, the agent's system message will be updated with the tool candidates' information and how to call the tools.
+A user proxy with the ability to execute the code will be added to the nested chat. Under the hood, this is achieved by leveraging the
+[User Defined Functions](/docs/topics/code-execution/user-defined-functions) feature. A `LocalCommandLineCodeExecutor` equipped with all the functions serves as
+the code executor for the user proxy.