CaptainAgent PR Part 1: Adding blog post and document #27

Merged
merged 23 commits into from
Nov 19, 2024
dc99178
Update doc and blog
LeoLjl Nov 17, 2024
43c83d0
Bug fix attempt.
LeoLjl Nov 17, 2024
af3ce95
Bug fix attempt.
LeoLjl Nov 17, 2024
b79b3ef
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 18, 2024
10c7a48
Update blog.
LeoLjl Nov 18, 2024
31b9aed
Update blog
LeoLjl Nov 18, 2024
8786631
Specify details and update user interface.
LeoLjl Nov 19, 2024
68251bd
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
064a59b
commit
LeoLjl Nov 19, 2024
a34b9b6
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
d44eb74
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
67b5a8b
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
093c14f
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
b1622e2
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
43d9655
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
cf70aad
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
53b88ac
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
f97cc3c
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
7e5099d
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
d9ed46b
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
bf879aa
Update website/blog/2024-11-15-CaptainAgent/index.mdx
LeoLjl Nov 19, 2024
47c9985
Finalize.
LeoLjl Nov 19, 2024
d9519de
Update.
LeoLjl Nov 19, 2024
42 changes: 42 additions & 0 deletions notebook/captainagent_expert_library.json

Large diffs are not rendered by default.

3 changes: 3 additions & 0 deletions website/blog/2024-11-15-CaptainAgent/img/build.png
3 changes: 3 additions & 0 deletions website/blog/2024-11-15-CaptainAgent/img/chat.png
3 changes: 3 additions & 0 deletions website/blog/2024-11-15-CaptainAgent/img/overall.png
137 changes: 137 additions & 0 deletions website/blog/2024-11-15-CaptainAgent/index.mdx
@@ -0,0 +1,137 @@
---
title: "Introducing CaptainAgent for Adaptive Team Building"
authors:
- LinxinS97
- jialeliu
- jieyuz2
tags: [LLM, GPT, AutoBuild]
---
![Illustration of how CaptainAgent builds a team](img/overall.png)

**TL;DR**
- We introduce CaptainAgent, an agent that can adaptively assemble a team of agents through a retrieval-selection-generation process to handle complex tasks via the [`nested chat`](https://ag2ai.github.io/ag2/docs/tutorial/conversation-patterns#nested-chats) conversation pattern in AG2.
- CaptainAgent supports all types of `ConversableAgents` implemented in AG2.

# Introduction

Given an ad-hoc task, dynamically assembling a group of agents capable of effectively solving the problem is a complex challenge. In many cases, we manually design and select the agents involved. In this blog, we introduce **CaptainAgent**, an intelligent agent that can autonomously assemble a team of agents tailored to meet diverse and complex task requirements.
CaptainAgent iterates over the following two steps until the problem is successfully solved.
- (**Step 1**) CaptainAgent will break down the task, recommend several roles needed for each subtask, and then create a team of agents accordingly. Each agent in the team is either generated from scratch or retrieved and selected from an agent library if provided. Each of them will also be equipped with predefined tools retrieved from a tool library if provided.
![Building workflow](img/build.png)
- (**Step 2**) For each subtask, the corresponding team of agents will jointly solve it. Once it's done, a summarization and reflection step will be triggered to generate a report based on the multi-agent conversation history. Based on the report, CaptainAgent will decide whether to adjust the subtasks and corresponding team (go to Step 1) or to terminate and output the results.
![Chat workflow](img/chat.png)

The design of CaptainAgent allows it to leverage agents and tools from a pre-specified agent library and tool library. In the following sections, we demonstrate how to use CaptainAgent with or without the provided libraries.
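
The two-step loop described above can be sketched in plain Python. This is only an illustration of the control flow, not CaptainAgent's actual implementation: `build_team`, `solve_subtask`, and `reflect` are hypothetical callables standing in for the LLM-driven steps, and `max_turns` is a hypothetical loop bound that merely borrows its name from the configuration field used later in this post.

```python
from typing import Callable, List, Tuple


def captain_loop(
    task: str,
    build_team: Callable[[str], List[str]],
    solve_subtask: Callable[[str, List[str]], str],
    reflect: Callable[[str], Tuple[bool, str]],
    max_turns: int = 3,
) -> str:
    """Iterate build -> solve -> reflect until done or max_turns is reached."""
    report = ""
    current_task = task
    for _ in range(max_turns):
        team = build_team(current_task)              # Step 1: assemble a team
        history = solve_subtask(current_task, team)  # Step 2: the team works on it
        done, report = reflect(history)              # summarize and reflect
        if done:
            break
        # Feed the report back so the next round can adjust the plan and team.
        current_task = f"{task}\nPrevious report: {report}"
    return report


# Toy stand-ins (hypothetical) so the sketch runs without any LLM.
calls: List[str] = []


def toy_build(current_task: str) -> List[str]:
    calls.append(current_task)
    return ["Planner", "Coder"]


def toy_solve(current_task: str, team: List[str]) -> str:
    return f"{len(team)} agents worked on attempt {len(calls)}"


def toy_reflect(history: str) -> Tuple[bool, str]:
    # Pretend the second attempt is good enough.
    return (len(calls) >= 2, f"report: {history}")


final_report = captain_loop("play game of 24", toy_build, toy_solve, toy_reflect)
print(final_report)
```

The point of the sketch is the feedback edge: the report produced in Step 2 is what decides whether the loop returns to Step 1 or terminates.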

# Using CaptainAgent without pre-specified agent/tool libraries
CaptainAgent can serve as a drop-in replacement for the general `AssistantAgent` class in AG2. To do that, we just need to add a few lines of configuration for the group chat involved.
Without the agent library and tool library, CaptainAgent will automatically generate a set of agents into a group chat.
Review comment (Collaborator), suggested change:
- Without the agent library and tool library, CaptainAgent will automatically generate a set of agents into a group chat.
+ When an agent library is not provided, CaptainAgent will automatically generate a set of agents from scratch to form a group chat.


```python
import autogen  # needed for autogen.config_list_from_json below
from autogen.agentchat.contrib.captain_agent import CaptainAgent
from autogen import UserProxyAgent

general_llm_config = {
    "temperature": 0,
    "config_list": autogen.config_list_from_json("OAI_CONFIG_LIST", filter_dict={"model": ["gpt-4-1106-preview"]}),
}

nested_mode_config = {
    "autobuild_init_config": {
        "config_file_or_env": "OAI_CONFIG_LIST",
        "builder_model": "gpt-4-1106-preview",
        "agent_model": "gpt-4-1106-preview",
    },
    # this is used to configure the autobuild building process
    "autobuild_build_config": {
        "default_llm_config": {"temperature": 1, "top_p": 0.95},
        "code_execution_config": {"timeout": 300, "work_dir": "groupchat", "last_n_messages": 1},
        "coding": True,
    },
    "group_chat_config": {"max_round": 15},
    "group_chat_llm_config": general_llm_config.copy(),
    "max_turns": 3,
}

# build agents
captain_agent = CaptainAgent(
    name="captain_agent",
    llm_config=general_llm_config,
    nested_mode_config=nested_mode_config,
)
user_proxy = UserProxyAgent(
    name="user_proxy",
    code_execution_config={"use_docker": False},
)
query = "Let's play game of 24. Given 4 numbers, you need to use +, -, *, / to get 24. The numbers are 2, 2, 7, 12."
result = user_proxy.initiate_chat(captain_agent, message=query)
```

# Using CaptainAgent with pre-specified agent/tool libraries
To use CaptainAgent with pre-specified agent/tool libraries, we just need to specify the path to the agent library and tool library. The two libraries are independent, so you can choose to use one of the libraries or both.
The tool library we provide requires subscribing to specific APIs; please refer to the [docs](https://ag2ai.github.io/ag2/docs/topics/captainagent) for details. The following example does not necessarily require tool usage, so it's fine if you have not subscribed to them.

To use agents from an agent library, specify the `library_path` sub-field in CaptainAgent's configuration. To use tools from a tool library, specify the `autobuild_tool_config` field.

```python
import autogen  # needed for autogen.config_list_from_json below
from autogen.agentchat.contrib.captain_agent import CaptainAgent
from autogen import UserProxyAgent

general_llm_config = {
    "temperature": 0,
    "config_list": autogen.config_list_from_json("OAI_CONFIG_LIST", filter_dict={"model": ["gpt-4-1106-preview"]}),
}

nested_mode_config = {
    "autobuild_init_config": {
        "config_file_or_env": "OAI_CONFIG_LIST",
        "builder_model": "gpt-4-1106-preview",
        "agent_model": "gpt-4-1106-preview",
    },
    # this is used to configure the autobuild building process
    "autobuild_build_config": {
        "default_llm_config": {"temperature": 1, "top_p": 0.95},
        "code_execution_config": {"timeout": 300, "work_dir": "groupchat", "last_n_messages": 1},
        "coding": True,
        "library_path": "captainagent_expert_library.json",
    },
    "autobuild_tool_config": {
        "tool_root": "default",  # this will use the tool library we provide
        "retriever": "all-mpnet-base-v2",
    },
    "group_chat_config": {"max_round": 10},
    "group_chat_llm_config": general_llm_config.copy(),
    "max_turns": 3,
}

# build agents
captain_agent = CaptainAgent(
    name="captain_agent",
    llm_config=general_llm_config,
    nested_mode_config=nested_mode_config,
)
user_proxy = UserProxyAgent(
    name="user_proxy",
    code_execution_config={"use_docker": False},
)
query = 'Find the stock price of Microsoft in the past 1 year and plot a line chart to show the trend. Save the line chart as "microsoft_stock_price.png".'
result = user_proxy.initiate_chat(captain_agent, message=query)
```
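
Because the two libraries are independent, the library-related fields can be layered onto a base configuration one at a time. The following is a minimal sketch using only plain dictionaries; `with_libraries` is a hypothetical helper (not part of AG2), and the field names are copied from the configuration shown above.

```python
import copy
from typing import Optional


def with_libraries(
    base_config: dict,
    agent_library_path: Optional[str] = None,
    tool_root: Optional[str] = None,
    retriever: str = "all-mpnet-base-v2",
) -> dict:
    """Return a copy of base_config with optional agent/tool library fields set."""
    config = copy.deepcopy(base_config)  # never mutate the caller's config
    if agent_library_path is not None:
        # The agent library is a sub-field of the build config.
        config.setdefault("autobuild_build_config", {})["library_path"] = agent_library_path
    if tool_root is not None:
        # The tool library has its own top-level field.
        config["autobuild_tool_config"] = {"tool_root": tool_root, "retriever": retriever}
    return config


base = {
    "autobuild_build_config": {"coding": True},
    "group_chat_config": {"max_round": 10},
    "max_turns": 3,
}
agents_only = with_libraries(base, agent_library_path="captainagent_expert_library.json")
tools_only = with_libraries(base, tool_root="default")
print(sorted(agents_only))
print(sorted(tools_only))
```

Working on a deep copy keeps the base configuration reusable, so the same starting point can produce an agents-only, a tools-only, or a combined setup.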

# Further Reading
For a detailed description of how to configure CaptainAgent, please refer to the [documentation](https://ag2ai.github.io/ag2/docs/topics/captainagent).

Please refer to our [paper](https://arxiv.org/pdf/2405.19425) for more details about CaptainAgent and the proposed new team-building paradigm: adaptive build.

If you find this blog useful, please consider citing:
```
@misc{song2024adaptiveinconversationteambuilding,
title={Adaptive In-conversation Team Building for Language Model Agents},
author={Linxin Song and Jiale Liu and Jieyu Zhang and Shaokun Zhang and Ao Luo and Shijian Wang and Qingyun Wu and Chi Wang},
year={2024},
eprint={2405.19425},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2405.19425},
}
```
6 changes: 3 additions & 3 deletions website/blog/authors.yml
@@ -18,9 +18,9 @@ yiranwu:

jialeliu:
name: Jiale Liu
-  title: Undergraduate student at Xidian University
-  url: https://leoljl.github.io
-  image_url: https://github.com/LeoLjl/leoljl.github.io/blob/main/profile.jpg?raw=true
+  title: PhD student at Pennsylvania State University
+  url: https://github.com/LeoLjl
+  image_url: https://github.com/leoljl.png

thinkall:
name: Li Jiang
4 changes: 4 additions & 0 deletions website/docs/topics/captainagent/_category_.json
@@ -0,0 +1,4 @@
{
"label": "Captain Agent",
"collapsible": true
}
39 changes: 39 additions & 0 deletions website/docs/topics/captainagent/agent_library.mdx
@@ -0,0 +1,39 @@
# Agent Library
A simple agent in the agent library requires three fields:
- description: This describes the functionality of the agent.
- system_message: This provides the system message of the agent for initialization.
- name: The name of the agent.

An example of the agent library is as follows.
```
[
  {
    "description": "The Python_Programming_Expert specializes in using Python's pandas and numpy libraries to manipulate large data sets, particularly focusing on creating and analyzing a new 'STEM' feature from educational datasets, and works collaboratively in a multidisciplinary team.",
    "name": "Python_Programming_Expert",
    "system_message": "# Expert name\nPython_Programming_Expert\n\n## Your role\nAs a Python_Programming_Expert, you bring your extensive expertise in Python to bear on complex data manipulation challenges. Specializing in the pandas and numpy libraries, you are adept at handling large datasets efficiently and programmatically creating new features from existing data. Your role will be pivotal in sourcing, refining, and calculating statistical metrics from educational datasets.\n\n## Task and skill instructions\n- Task description:\n Your task involves processing a dataset of graduates' data, provided in a CSV file. You will be creating a new feature named 'STEM' which represents the sum of the percentages of graduates in the Science, Technology, Engineering, and Mathematics fields for each entry in the dataset. Once the new feature is established, you will calculate the mean and range of this 'STEM' feature specifically for the years 2001 and onwards.\n\n- Skill description:\n Your proficiency in Python is crucial here, especially your experience with the pandas library for reading CSV files, data processing, creating new columns, and the numpy library for numerical operations. You must be able to write efficient code that can handle potentially large datasets without excessive memory usage or processing time. Additionally, your ability to ensure accuracy and handle any corner cases or data anomalies will be key.\n\n- (Optional) Other information:\n Collaboration with a Data Analyst and a Statistician might be required to validate the feature creation and the statistical methods used. Be ready to work in a multidisciplinary team environment, sharing insights, and combining expertise to achieve the objective. Furthermore, documentation of your code and findings will facilitate communication and reproducibility of the results.\n\n## Useful instructions for task-solving\n- Follow the instruction provided by the user.\n- Solve the task step by step if you need to.\n- If a plan is not provided, explain your plan first.\n- If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\n- When you find an answer, verify the answer carefully. \n- Include verifiable evidence in your response if possible.\n \n## How to use code?\n- Suggest python code (in a python coding block) or shell script (in a sh coding block) for the Computer_terminal to execute.\n- When using code, you must indicate the script type in the code block.\n- Do not suggest incomplete code which requires users to modify.\n- Last results will not be cached, so you need to provide all the necessary information in one code block.\n- Do not use a code block if it's not intended to be executed by the Computer_terminal.\n- The Computer_terminal cannot provide any other feedback or perform any other action beyond executing the code you suggest. \n- The Computer_terminal can't modify your code.\n- Use 'print' function for the output when relevant. \n- Check the execution result returned by the user.\n- Do not ask users to copy and paste the result.\n- If the result indicates there is an error, fix the error and output the code again. \n- If you want the Computer_terminal to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. "
  }
]
```

We provide a predefined agent library in `notebook/captainagent_expert_library.json`.
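
Before pointing CaptainAgent at a custom library file, it may be worth checking that every entry carries the three required fields. The sketch below uses only the standard library; `find_invalid_entries` is a hypothetical helper (not part of AG2), and the sample entries are shortened placeholders.

```python
import json
from typing import List, Tuple

REQUIRED_FIELDS = ("name", "description", "system_message")


def find_invalid_entries(library_json: str) -> List[Tuple[int, List[str]]]:
    """Return (index, missing_fields) for each entry lacking a required field."""
    entries = json.loads(library_json)
    problems = []
    for index, entry in enumerate(entries):
        missing = [field for field in REQUIRED_FIELDS if field not in entry]
        if missing:
            problems.append((index, missing))
    return problems


# Placeholder library: the second entry has no system_message.
sample = json.dumps(
    [
        {
            "name": "Python_Programming_Expert",
            "description": "Writes pandas and numpy code.",
            "system_message": "# Expert name\nPython_Programming_Expert",
        },
        {"name": "Incomplete_Expert", "description": "Missing its system message."},
    ]
)
problems = find_invalid_entries(sample)
print(problems)
```

For a real library, read the file contents and pass them to the same function.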

## Adding advanced agents
Besides agents that differ only in their system message, we also support adding agents with advanced capabilities to the library. Just add an `agent_path` field and any other arguments that need to be passed at initialization. For example, to add a WebSurferAgent:

```
[
  {
    "name": "WebServing_Expert",
    "description": "A helpful assistant with access to a web browser. Ask them to perform web searches, open pages, navigate to Wikipedia, answer questions from pages, and or generate summaries.",
    "system_message": "",
    "agent_path": "autogen/agentchat/contrib/web_surfer/WebSurferAgent",
    "browser_config": {
      "viewport_size": 5120,
      "downloads_folder": "coding",
      "request_kwargs": {
        "headers": {
          "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0"
        }
      }
    }
  }
]
```
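
To make the path convention more concrete, here is one plausible way a slash-separated path such as the one in the example could be resolved to a Python class. This is a sketch under the assumption that the last path segment is the class name and the rest is the module path; `resolve_class` is a hypothetical helper, and the library's own loading logic may differ. A standard-library class is used so the snippet runs without AG2 installed.

```python
import importlib


def resolve_class(path: str):
    """Resolve 'package/module/ClassName' to the class object it names."""
    # Assumption: the final segment is the class, everything before is the module.
    module_path, class_name = path.replace("/", ".").rsplit(".", 1)
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Stand-in path from the standard library instead of an AG2 agent class.
cls = resolve_class("collections/OrderedDict")
print(cls.__name__)
```

Under the same assumption, the remaining fields of the entry (for instance `browser_config`) would be forwarded as keyword arguments when the class is instantiated.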
59 changes: 59 additions & 0 deletions website/docs/topics/captainagent/tool_library.mdx
@@ -0,0 +1,59 @@
# Tool Library
In CaptainAgent, tools are Python functions. Agents can write code to import these functions and call them as needed, which can significantly extend what the agents can do.

We provide a set of tools that come with the release of CaptainAgent.

## Using the Built-in Tool Library
### Install dependencies
First install the requirements for running tools via pip. The requirements file is located in `autogen/agentchat/contrib/captainagent/tools/requirements.txt`.

### Subscribe to Certain APIs
To use the provided built-in tools, you need a Bing Search API key and a RapidAPI key.
For the Bing API, you can read more about how to get an API key on the [Bing Web Search API](https://www.microsoft.com/en-us/bing/apis/bing-web-search-api) page.
For RapidAPI, you can [sign up](https://rapidapi.com/auth/sign-up) and subscribe to these two APIs ([link1](https://rapidapi.com/solid-api-solid-api-default/api/youtube-transcript3), [link2](https://rapidapi.com/420vijay47/api/youtube-mp3-downloader2)).
Both APIs have free billing options, so there is no need to worry about extra costs for small-scale runs.

Whenever you run tool-related code, remember to export the API keys as environment variables.
```bash
export BING_API_KEY=""
export RAPID_API_KEY=""
```
or
```python
import os
os.environ["BING_API_KEY"] = ""
os.environ["RAPID_API_KEY"] = ""
```

Then you are good to go. Feel free to try out the examples provided in the CaptainAgent notebook.
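
A quick pre-flight check can catch a forgotten key before a long run starts. The sketch below is a minimal example using only the standard library; `missing_api_keys` is a hypothetical helper, and the example environment is a plain dictionary so that no real keys are needed.

```python
import os
from typing import List, Mapping

REQUIRED_KEYS = ("BING_API_KEY", "RAPID_API_KEY")


def missing_api_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required keys that are unset or empty in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example environment with one key left empty.
example_env = {"BING_API_KEY": "placeholder-value", "RAPID_API_KEY": ""}
missing = missing_api_keys(example_env)
print(missing)
```

Calling `missing_api_keys()` with no argument checks the real process environment.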

## What is in the Tool Library
The tool library consists of three types of tools: math, data_analysis and information_retrieval. Its folder layout is as follows. Each `.py` file implements a tool.

```
tools
├── README.md
├── data_analysis
│ ├── calculate_correlation.py
│ └── ...
├── information_retrieval
│ ├── arxiv_download.py
│ ├── arxiv_search.py
│ └── ...
├── math
│ ├── calculate_circle_area_from_diameter.py
│ └── ...
└── tool_description.tsv
```

Tools can be imported from `tools/{category}/{tool_name}.py` with exactly the same tool name.
`tool_description.tsv` contains descriptions of tools for retrieval.
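
The naming convention (file name equals function name) means a tool can be located from its category and name alone. The following sketch illustrates this with a temporary directory. `load_tool` is a hypothetical helper, and the body of the stand-in tool file is an assumption written for the demo, not the library's actual implementation.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_tool(tool_root: Path, category: str, tool_name: str):
    """Load tools/{category}/{tool_name}.py and return the function named tool_name."""
    tool_file = tool_root / category / f"{tool_name}.py"
    spec = importlib.util.spec_from_file_location(tool_name, tool_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, tool_name)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "tools"
    (root / "math").mkdir(parents=True)
    # Stand-in tool file (assumed body) following the naming convention.
    (root / "math" / "calculate_circle_area_from_diameter.py").write_text(
        "import math\n\n"
        "def calculate_circle_area_from_diameter(diameter):\n"
        "    return math.pi * (diameter / 2) ** 2\n"
    )
    tool = load_tool(root, "math", "calculate_circle_area_from_diameter")
    area = tool(2.0)

print(round(area, 4))
```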

## How the Tool Library works
When an agent's description is provided, a retriever retrieves `top_k` tool candidates from the library based on the semantic
similarity between the agent description and each tool description. The tool descriptions are the same as those in `tool_description.tsv`.
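
The retrieval step can be illustrated with a toy ranking function. Note the substitution: the sketch below uses a bag-of-words cosine similarity as a stand-in for a learned sentence-embedding retriever, so it only shows the ranking logic, not real semantic matching. The tool names come from the folder layout above, while the descriptions and `retrieve_top_k` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def bag_of_words(text: str) -> Counter:
    """Very rough text representation: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_top_k(agent_description: str, tool_descriptions: Dict[str, str], top_k: int = 2) -> List[str]:
    """Rank tools by similarity to the agent description and keep the best top_k."""
    query = bag_of_words(agent_description)
    ranked = sorted(
        tool_descriptions,
        key=lambda name: cosine(query, bag_of_words(tool_descriptions[name])),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical descriptions, standing in for rows of tool_description.tsv.
tools = {
    "arxiv_search": "search arxiv for papers matching a query",
    "calculate_correlation": "calculate the correlation between two columns of a csv file",
    "calculate_circle_area_from_diameter": "calculate the area of a circle from its diameter",
}
top = retrieve_top_k("an expert who can search arxiv papers", tools, top_k=1)
print(top)
```

An embedding-based retriever would replace `bag_of_words` and `cosine` with vector encodings and their similarity, but the top-k selection would look the same.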

After candidates are retrieved, the agent's system message will be updated with the tool candidates' information and how to call the tools.
A user proxy with the ability to execute the code will be added to the nested chat. Under the hood, this is achieved by leveraging the
[User Defined Functions](/docs/topics/code-execution/user-defined-functions) feature. A `LocalCommandLineCodeExecutor` equipped with all the functions serves as
the code executor for the user proxy.