CaptainAgent PR Part 1: Adding blog post and document #27
Merged
23 commits
dc99178 Update doc and blog (LeoLjl)
43c83d0 Bug fix attempt. (LeoLjl)
af3ce95 Bug fix attempt. (LeoLjl)
b79b3ef Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
10c7a48 Update blog. (LeoLjl)
31b9aed Update blog (LeoLjl)
8786631 Specify details and update user interface. (LeoLjl)
68251bd Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
064a59b commit (LeoLjl)
a34b9b6 Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
d44eb74 Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
67b5a8b Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
093c14f Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
b1622e2 Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
43d9655 Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
cf70aad Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
53b88ac Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
f97cc3c Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
7e5099d Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
d9ed46b Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
bf879aa Update website/blog/2024-11-15-CaptainAgent/index.mdx (LeoLjl)
47c9985 Finalize. (LeoLjl)
d9519de Update. (LeoLjl)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,137 @@ | ||
--- | ||
title: "Introducing CaptainAgent for Adaptive Team Building" | ||
authors: | ||
- LinxinS97 | ||
- jialeliu | ||
- jieyuz2 | ||
tags: [LLM, GPT, AutoBuild] | ||
--- | ||
![Illustration of how CaptainAgent builds a team](img/overall.png) | ||
|
||
**TL;DR** | ||
- We introduce CaptainAgent, an agent that adaptively assembles a team of agents through a retrieval-selection-generation process and handles complex tasks via the [`nested chat`](https://ag2ai.github.io/ag2/docs/tutorial/conversation-patterns#nested-chats) conversation pattern in AG2. | ||
- CaptainAgent supports all types of `ConversableAgent` implemented in AG2. | ||
|
||
# Introduction | ||
|
||
Given an ad-hoc task, dynamically assembling a group of agents capable of effectively solving the problem is a complex challenge. In many cases, we manually design and select the agents involved. In this blog, we introduce **CaptainAgent**, an intelligent agent that can autonomously assemble a team of agents tailored to meet diverse and complex task requirements. | ||
CaptainAgent iterates over the following two steps until the problem is successfully solved. | ||
- (**Step 1**) CaptainAgent will break down the task, recommend several roles needed for each subtask, and then create a team of agents accordingly. Each agent in the team is either generated from scratch or retrieved and selected from an agent library if provided. Each of them will also be equipped with predefined tools retrieved from a tool library if provided. | ||
![Building workflow](img/build.png) | ||
- (**Step 2**) For each subtask, the corresponding team of agents will jointly solve it. Once it's done, a summarization and reflection step will be triggered to generate a report based on the multi-agent conversation history. Based on the report, CaptainAgent will decide whether to adjust the subtasks and corresponding team (go to Step 1) or to terminate and output the results. | ||
![Chat workflow](img/chat.png) | ||
|
||
The design of CaptainAgent allows it to leverage agents and tools from a pre-specified agent library and tool library. In the following section, we demonstrate how to use CaptainAgent with or without the provided library. | ||
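
The two-step iteration described above can be pictured as a small control loop. The following is only a minimal sketch under stated assumptions: `Report`, `run_captain_loop`, `build_team`, `solve_with_team`, and `max_iterations` are hypothetical names chosen for illustration and are not part of the AG2 API, and the stub callables stand in for the LLM-driven behavior.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Report:
    """Result of the summarization and reflection step (hypothetical structure)."""

    solved: bool
    summary: str


def run_captain_loop(
    task: str,
    build_team: Callable[[str], List[str]],
    solve_with_team: Callable[[str, List[str]], Report],
    max_iterations: int = 3,
) -> Report:
    """Alternate between Step 1 (build a team) and Step 2 (solve, then reflect)."""
    report = Report(solved=False, summary="no attempt made")
    for _ in range(max_iterations):
        team = build_team(task)  # Step 1: decompose the task and assemble agents
        report = solve_with_team(task, team)  # Step 2: group chat plus reflection report
        if report.solved:
            break  # terminate and output the results
        # Otherwise feed the report back so the next team can be adjusted.
        task = f"{task}\nPrevious report: {report.summary}"
    return report


# Stub callables standing in for LLM-driven behavior (illustrative only).
attempts: List[str] = []


def demo_build_team(task: str) -> List[str]:
    return ["Planner", "Coder"]


def demo_solve(task: str, team: List[str]) -> Report:
    attempts.append(task)
    # Pretend the second attempt succeeds.
    return Report(solved=len(attempts) >= 2, summary=f"attempt {len(attempts)} by {team}")


final_report = run_captain_loop("Play the game of 24", demo_build_team, demo_solve)
print(final_report.solved, len(attempts))
```

In this sketch the loop stops as soon as the report marks the task as solved; the real CaptainAgent makes that decision from the generated report rather than from a boolean flag.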
|
||
# Using CaptainAgent without pre-specified agent/tool libraries | ||
CaptainAgent can serve as a drop-in replacement for the general `AssistantAgent` class in AG2. To do that, we only need to add a few lines of configuration for the group chat involved. | ||
Without an agent library or tool library, CaptainAgent automatically generates a set of agents and places them in a group chat. | ||
|
||
```python | ||
import autogen  # needed for autogen.config_list_from_json below | ||
from autogen.agentchat.contrib.captain_agent import CaptainAgent | ||
from autogen import UserProxyAgent | ||
|
||
general_llm_config = { | ||
"temperature": 0, | ||
"config_list": autogen.config_list_from_json("OAI_CONFIG_LIST", filter_dict={"model": ["gpt-4-1106-preview"]}), | ||
} | ||
|
||
nested_mode_config = { | ||
"autobuild_init_config": { | ||
"config_file_or_env": "OAI_CONFIG_LIST", | ||
"builder_model": "gpt-4-1106-preview", | ||
"agent_model": "gpt-4-1106-preview", | ||
}, | ||
# this is used to configure the autobuild building process | ||
"autobuild_build_config": { | ||
"default_llm_config": {"temperature": 1, "top_p": 0.95}, | ||
"code_execution_config": {"timeout": 300, "work_dir": "groupchat", "last_n_messages": 1}, | ||
"coding": True, | ||
}, | ||
"group_chat_config": {"max_round": 15}, | ||
"group_chat_llm_config": general_llm_config.copy(), | ||
"max_turns": 3, | ||
} | ||
|
||
## build agents | ||
captain_agent = CaptainAgent( | ||
name="captain_agent", | ||
llm_config=general_llm_config, | ||
nested_mode_config=nested_mode_config, | ||
) | ||
user_proxy = UserProxyAgent( | ||
name="user_proxy", | ||
code_execution_config={"use_docker": False}, | ||
) | ||
query = "Let's play game of 24. Given 4 numbers, you need to use +, -, *, / to get 24. The numbers are 2, 2, 7, 12." | ||
result = user_proxy.initiate_chat(captain_agent, message=query) | ||
``` | ||
|
||
# Using CaptainAgent with pre-specified agent/tool libraries | ||
To use CaptainAgent with pre-specified agent/tool libraries, we just need to specify the path to the agent library and tool library. The two libraries are independent, so you can choose to use one of the libraries or both. | ||
The tool library we provide requires subscribing to specific APIs; please refer to the [docs](https://ag2ai.github.io/ag2/docs/topics/captainagent) for details. The following example does not necessarily require tool usage, so it's fine if you have not subscribed to them. | ||
|
||
To use agents from an agent library, specify a `library_path` sub-field in `autobuild_build_config`. To use tools from a tool library, add an `autobuild_tool_config` field to CaptainAgent's configuration. | ||
|
||
```python | ||
import autogen  # needed for autogen.config_list_from_json below | ||
from autogen.agentchat.contrib.captain_agent import CaptainAgent | ||
from autogen import UserProxyAgent | ||
|
||
general_llm_config = { | ||
"temperature": 0, | ||
"config_list": autogen.config_list_from_json("OAI_CONFIG_LIST", filter_dict={"model": ["gpt-4-1106-preview"]}), | ||
} | ||
|
||
nested_mode_config = { | ||
"autobuild_init_config": { | ||
"config_file_or_env": "OAI_CONFIG_LIST", | ||
"builder_model": "gpt-4-1106-preview", | ||
"agent_model": "gpt-4-1106-preview", | ||
}, | ||
# this is used to configure the autobuild building process | ||
"autobuild_build_config": { | ||
"default_llm_config": {"temperature": 1, "top_p": 0.95}, | ||
"code_execution_config": {"timeout": 300, "work_dir": "groupchat", "last_n_messages": 1}, | ||
"coding": True, | ||
"library_path": "captainagent_expert_library.json" | ||
}, | ||
"autobuild_tool_config": { | ||
"tool_root": "default", # this will use the tool library we provide | ||
"retriever": "all-mpnet-base-v2", | ||
}, | ||
"group_chat_config": {"max_round": 10}, | ||
"group_chat_llm_config": general_llm_config.copy(), | ||
"max_turns": 3 | ||
} | ||
|
||
## build agents | ||
captain_agent = CaptainAgent( | ||
name="captain_agent", | ||
llm_config=general_llm_config, | ||
nested_mode_config=nested_mode_config, | ||
) | ||
user_proxy = UserProxyAgent( | ||
name="user_proxy", | ||
code_execution_config={"use_docker": False}, | ||
) | ||
query = 'Find the stock price of Microsoft in the past 1 year and plot a line chart to show the trend. Save the line chart as "microsoft_stock_price.png".' | ||
result = user_proxy.initiate_chat(captain_agent, message=query) | ||
``` | ||
|
||
# Further Reading | ||
For a detailed description of how to configure CaptainAgent, please refer to the [documentation](https://ag2ai.github.io/ag2/docs/topics/captainagent). | ||
|
||
Please refer to our [paper](https://arxiv.org/pdf/2405.19425) for more details about CaptainAgent and the proposed new team-building paradigm: adaptive build. | ||
|
||
If you find this blog useful, please consider citing: | ||
``` | ||
@misc{song2024adaptiveinconversationteambuilding, | ||
title={Adaptive In-conversation Team Building for Language Model Agents}, | ||
author={Linxin Song and Jiale Liu and Jieyu Zhang and Shaokun Zhang and Ao Luo and Shijian Wang and Qingyun Wu and Chi Wang}, | ||
year={2024}, | ||
eprint={2405.19425}, | ||
archivePrefix={arXiv}, | ||
primaryClass={cs.CL}, | ||
url={https://arxiv.org/abs/2405.19425}, | ||
} | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"label": "Captain Agent", | ||
"collapsible": true | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
# Agent Library | ||
A simple agent in the agent library requires three fields: | ||
- description: This describes the functionality of the agent. | ||
- system_message: This provides the system message of the agent for initialization. | ||
- name: The name of the agent. | ||
|
||
An example of the agent library is as follows. | ||
``` | ||
[ | ||
{ | ||
"description": "The Python_Programming_Expert specializes in using Python's pandas and numpy libraries to manipulate large data sets, particularly focusing on creating and analyzing a new 'STEM' feature from educational datasets, and works collaboratively in a multidisciplinary team.", | ||
"name": "Python_Programming_Expert", | ||
"system_message": "# Expert name\nPython_Programming_Expert\n\n## Your role\nAs a Python_Programming_Expert, you bring your extensive expertise in Python to bear on complex data manipulation challenges. Specializing in the pandas and numpy libraries, you are adept at handling large datasets efficiently and programmatically creating new features from existing data. Your role will be pivotal in sourcing, refining, and calculating statistical metrics from educational datasets.\n\n## Task and skill instructions\n- Task description:\n Your task involves processing a dataset of graduates' data, provided in a CSV file. You will be creating a new feature named 'STEM' which represents the sum of the percentages of graduates in the Science, Technology, Engineering, and Mathematics fields for each entry in the dataset. Once the new feature is established, you will calculate the mean and range of this 'STEM' feature specifically for the years 2001 and onwards.\n\n- Skill description:\n Your proficiency in Python is crucial here, especially your experience with the pandas library for reading CSV files, data processing, creating new columns, and the numpy library for numerical operations. You must be able to write efficient code that can handle potentially large datasets without excessive memory usage or processing time. Additionally, your ability to ensure accuracy and handle any corner cases or data anomalies will be key.\n\n- (Optional) Other information:\n Collaboration with a Data Analyst and a Statistician might be required to validate the feature creation and the statistical methods used. Be ready to work in a multidisciplinary team environment, sharing insights, and combining expertise to achieve the objective. Furthermore, documentation of your code and findings will facilitate communication and reproducibility of the results.\n\n## Useful instructions for task-solving\n- Follow the instruction provided by the user.\n- Solve the task step by step if you need to.\n- If a plan is not provided, explain your plan first.\n- If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n- When you find an answer, verify the answer carefully. \n- Include verifiable evidence in your response if possible.\n \n## How to use code?\n- Suggest python code (in a python coding block) or shell script (in a sh coding block) for the Computer_terminal to execute.\n- When using code, you must indicate the script type in the code block.\n- Do not suggest incomplete code which requires users to modify.\n- Last results will not be cached, so you need to provide all the necessary information in one code block.\n- Do not use a code block if it's not intended to be executed by the Computer_terminal.\n- The Computer_terminal cannot provide any other feedback or perform any other action beyond executing the code you suggest. \n- The Computer_terminal can't modify your code.\n- Use 'print' function for the output when relevant. \n- Check the execution result returned by the user.\n- Do not ask users to copy and paste the result.\n- If the result indicates there is an error, fix the error and output the code again. \n- If you want the Computer_terminal to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. " | ||
} | ||
] | ||
``` | ||
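
Before pointing CaptainAgent at a custom library file, it can help to check that every entry carries the three required fields. The following is a minimal sketch, not part of AG2: `find_invalid_entries` and the sample entries are hypothetical and only illustrate the field requirement described above.

```python
import json
from typing import Any, Dict, List

# The three fields listed above as required for a simple agent.
REQUIRED_FIELDS = ("name", "description", "system_message")


def find_invalid_entries(library: List[Dict[str, Any]]) -> List[str]:
    """Return one message per library entry that lacks a required field."""
    problems = []
    for index, entry in enumerate(library):
        missing = [field for field in REQUIRED_FIELDS if field not in entry]
        if missing:
            label = entry.get("name", f"entry {index}")
            problems.append(f"{label}: missing {', '.join(missing)}")
    return problems


# Hypothetical library content used only for this demonstration.
example_library = json.loads(
    """
    [
      {"name": "Good_Expert", "description": "Solves tasks.", "system_message": "You are helpful."},
      {"name": "Broken_Expert", "description": "Has no system message."}
    ]
    """
)
print(find_invalid_entries(example_library))
```

To check a real file, load it with `json.load` and pass the resulting list to `find_invalid_entries`.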
|
||
We provide a predefined agent library in `notebook/captainagent_expert_library.json`. | ||
|
||
## Adding advanced agents | ||
Beyond agents that differ only in their system message, the library also supports agents with advanced capabilities. To add one, include an `agent_path` field along with any other arguments that need to be passed at initialization. For example, to add a WebSurferAgent: | ||
|
||
``` | ||
[ | ||
{ | ||
"name": "WebServing_Expert", | ||
"description": "A helpful assistant with access to a web browser. Ask them to perform web searches, open pages, navigate to Wikipedia, answer questions from pages, and or generate summaries.", | ||
"system_message": "", | ||
"agent_path": "autogen/agentchat/contrib/web_surfer/WebSurferAgent", | ||
"browser_config": { | ||
"viewport_size": 5120, | ||
"downloads_folder": "coding", | ||
"request_kwargs": { | ||
"headers": { | ||
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0" | ||
} | ||
} | ||
} | ||
} | ||
] | ||
``` |
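
One plausible way to interpret a slash-separated value such as `autogen/agentchat/contrib/web_surfer/WebSurferAgent` is as a module path followed by a class name. The sketch below shows that interpretation using only the standard library; `load_class_from_path` is a hypothetical helper, and whether CaptainAgent resolves the field exactly this way is an assumption. A standard-library class is used in the demonstration so that it runs without AG2 installed.

```python
import importlib


def load_class_from_path(agent_path: str) -> type:
    """Resolve 'package/module/ClassName' to the class object (illustrative helper)."""
    module_path, _, class_name = agent_path.replace("/", ".").rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module/ClassName', got {agent_path!r}")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Demonstration with a standard-library class instead of an AG2 agent.
ordered_dict_cls = load_class_from_path("collections/OrderedDict")
print(ordered_dict_cls.__name__)
```

Under this reading, the remaining fields of the entry (for example `browser_config`) would be passed to the class constructor as keyword arguments.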
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
# Tool Library | ||
In CaptainAgent, tools take the form of Python functions. Agents can write code that imports these functions and calls them as needed, which can significantly extend what the agents are able to do. | ||
|
||
We provide a list of tools that come with the release of CaptainAgent. | ||
|
||
## Using the Built-in Tool Library | ||
### Install dependencies | ||
First install the requirements for running tools via pip. The requirements file is located in `autogen/agentchat/contrib/captainagent/tools/requirements.txt`. | ||
|
||
### Subscribe to Certain APIs | ||
To use the built-in tools, you need a Bing Search API key and a RapidAPI key. | ||
For the Bing API, see the [Bing Web Search API](https://www.microsoft.com/en-us/bing/apis/bing-web-search-api) page for how to obtain a key. | ||
For RapidAPI, [sign up](https://rapidapi.com/auth/sign-up) and subscribe to these two APIs ([link1](https://rapidapi.com/solid-api-solid-api-default/api/youtube-transcript3), [link2](https://rapidapi.com/420vijay47/api/youtube-mp3-downloader2)). | ||
Both APIs have free billing options, so small-scale runs should not incur extra costs. | ||
|
||
Whenever you run tool-related code, remember to export the API keys as environment variables. | ||
```bash | ||
export BING_API_KEY="" | ||
export RAPID_API_KEY="" | ||
``` | ||
or | ||
```python | ||
import os | ||
os.environ["BING_API_KEY"] = "" | ||
os.environ["RAPID_API_KEY"] = "" | ||
``` | ||
|
||
Then you are good to go. Feel free to try out the examples provided in the CaptainAgent notebook. | ||
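
Because the snippets above leave the key values empty, a quick check before running tool-related code can catch a forgotten key early. This is a minimal sketch: `missing_api_keys` is a hypothetical helper, and only the two variable names come from the text above.

```python
import os
from typing import List, Mapping, Optional

# Environment variable names taken from the export commands above.
REQUIRED_KEYS = ("BING_API_KEY", "RAPID_API_KEY")


def missing_api_keys(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required keys that are unset or empty in the given environment."""
    if env is None:
        env = os.environ
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example with an explicit mapping, so the check does not depend on the real environment.
print(missing_api_keys({"BING_API_KEY": "placeholder-value", "RAPID_API_KEY": ""}))
```

Calling `missing_api_keys()` with no argument inspects the current process environment.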
|
||
## What is in the Tool Library | ||
The tool library consists of three types of tools: math, data_analysis and information_retrieval. Its folder layout is as follows. Each `.py` file implements a tool. | ||
|
||
``` | ||
tools | ||
├── README.md | ||
├── data_analysis | ||
│ ├── calculate_correlation.py | ||
│ └── ... | ||
├── information_retrieval | ||
│ ├── arxiv_download.py | ||
│ ├── arxiv_search.py | ||
│ └── ... | ||
├── math | ||
│ ├── calculate_circle_area_from_diameter.py | ||
│ └── ... | ||
└── tool_description.tsv | ||
``` | ||
|
||
Tools can be imported from `tools/{category}/{tool_name}.py` with exactly the same tool name. | ||
`tool_description.tsv` contains descriptions of tools for retrieval. | ||
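
Given this layout, the available tools can be enumerated per category by walking the folder. The sketch below is illustrative only: `list_tools` is a hypothetical helper, and the demonstration builds a throwaway directory with placeholder files rather than reading the real library.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def list_tools(tool_root: Path) -> Dict[str, List[str]]:
    """Map each category folder under tool_root to the tool names it contains."""
    catalog: Dict[str, List[str]] = {}
    for category_dir in sorted(p for p in tool_root.iterdir() if p.is_dir()):
        names = sorted(f.stem for f in category_dir.glob("*.py") if f.stem != "__init__")
        if names:
            catalog[category_dir.name] = names
    return catalog


# Demonstration on a temporary copy of part of the layout shown above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "tools"
    for relative in (
        "math/calculate_circle_area_from_diameter.py",
        "information_retrieval/arxiv_search.py",
        "information_retrieval/arxiv_download.py",
    ):
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("# placeholder tool file\n")
    (root / "tool_description.tsv").write_text("")
    demo_catalog = list_tools(root)

print(demo_catalog)
```

Each name in the catalog corresponds to a file at `tools/{category}/{tool_name}.py`, matching the naming rule stated above.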
|
||
## How the Tool Library Works | ||
When an agent's description is provided, a retriever retrieves the `top_k` tool candidates from the library based on the semantic | ||
similarity between the agent description and each tool description. The tool descriptions are the same as those in `tool_description.tsv`. | ||
|
||
After candidates are retrieved, the agent's system message will be updated with the tool candidates' information and how to call the tools. | ||
A user proxy with the ability to execute the code will be added to the nested chat. Under the hood, this is achieved by leveraging the | ||
[User Defined Functions](/docs/topics/code-execution/user-defined-functions) feature. A `LocalCommandLineCodeExecutor` equipped with all the functions serves as | ||
the code executor for the user proxy. |
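
The retrieval step can be illustrated with a small ranking function. This sketch deliberately swaps the sentence-embedding retriever for a bag-of-words cosine similarity so that it runs with the standard library alone; `retrieve_tools` and the tool descriptions below are hypothetical and do not reproduce the contents of `tool_description.tsv`.

```python
import math
from collections import Counter
from typing import Dict, List


def _vectorize(text: str) -> Counter:
    """Turn text into word counts (a stand-in for a sentence embedding)."""
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_tools(agent_description: str, tool_descriptions: Dict[str, str], top_k: int = 3) -> List[str]:
    """Return the names of the top_k tools most similar to the agent description."""
    query = _vectorize(agent_description)
    ranked = sorted(
        tool_descriptions,
        key=lambda name: _cosine(query, _vectorize(tool_descriptions[name])),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical descriptions, written only for this example.
demo_tools = {
    "arxiv_search": "search arxiv for papers matching a query",
    "calculate_correlation": "calculate the correlation between two columns of a csv file",
    "calculate_circle_area_from_diameter": "calculate the area of a circle from its diameter",
}
best = retrieve_tools("expert who can search arxiv for research papers", demo_tools, top_k=1)
print(best)
```

A real retriever would compare dense embeddings instead of word counts, but the ranking-and-truncation logic stays the same.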