Commit

add two sota paper

qiyanjun committed Feb 27, 2024
1 parent 2b5db49 commit dd02df4

Showing 2 changed files with 13 additions and 2 deletions.
4 changes: 4 additions & 0 deletions _contents/S0-L23.md
@@ -14,6 +14,10 @@ In this session, our readings cover:

## Required Readings:

### Large Language Model based Multi-Agents: A Survey of Progress and Challenges
+ Taicheng Guo, Xiuying Chen, Yaqi Wang, Ruidi Chang, Shichao Pei, Nitesh V. Chawla, Olaf Wiest, Xiangliang Zhang
+ Large Language Models (LLMs) have achieved remarkable success across a wide array of tasks. Due to the impressive planning and reasoning abilities of LLMs, they have been used as autonomous agents to do many tasks automatically. Recently, based on the development of using one LLM as a single planning or decision-making agent, LLM-based multi-agent systems have achieved considerable progress in complex problem-solving and world simulation. To provide the community with an overview of this dynamic field, we present this survey to offer an in-depth discussion on the essential aspects of multi-agent systems based on LLMs, as well as the challenges. Our goal is for readers to gain substantial insights on the following questions: What domains and environments do LLM-based multi-agents simulate? How are these agents profiled and how do they communicate? What mechanisms contribute to the growth of agents' capacities? For those interested in delving into this field of study, we also summarize the commonly used datasets or benchmarks for them to have convenient access. To keep researchers updated on the latest studies, we maintain an open-source GitHub repository, dedicated to outlining the research on LLM-based multi-agent systems.
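
To make the abstract's notions of agent profiling and inter-agent communication concrete, here is a minimal, illustrative sketch (not from the paper): each agent's profile is just a system prompt, and `call_llm` is a hypothetical stand-in for any chat-completion backend.

```python
# Minimal sketch of two profiled LLM agents exchanging messages over a shared history.
# `call_llm` is a hypothetical placeholder, not a real API.
def call_llm(system_prompt: str, history: list[str]) -> str:
    # A real implementation would send the profile and history to an LLM here.
    return f"[{system_prompt.split(',')[0]}] responding to: {history[-1]}"

class Agent:
    def __init__(self, name: str, profile: str):
        self.name = name        # agent identity
        self.profile = profile  # "profiling": the role assigned via system prompt

    def act(self, history: list[str]) -> str:
        return call_llm(self.profile, history)

# Two agents with different profiles collaborating over a shared message history.
solver = Agent("solver", "You are a careful problem solver, propose a solution.")
critic = Agent("critic", "You are a critic, point out flaws in the last proposal.")

history = ["Task: estimate the cost of serving a 7B-parameter model."]
for _ in range(2):                       # fixed number of communication rounds
    for agent in (solver, critic):
        message = agent.act(history)
        history.append(f"{agent.name}: {message}")

print("\n".join(history))
```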


### Understanding the planning of LLM agents: A survey
+ https://arxiv.org/abs/2402.02716
11 changes: 9 additions & 2 deletions _contents/S0-L24.md
@@ -13,18 +13,25 @@ tags:
In this session, our readings cover:

## Required Readings:


### Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
+ https://arxiv.org/abs/2312.15234
+ In the rapidly evolving landscape of artificial intelligence (AI), generative large language models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However, the computational intensity and memory consumption of deploying these models present substantial challenges in terms of serving efficiency, particularly in scenarios demanding low latency and high throughput. This survey addresses the imperative need for efficient LLM serving methodologies from a machine learning system (MLSys) research perspective, standing at the crux of advanced AI innovations and practical system optimizations. We provide in-depth analysis, covering a spectrum of solutions, ranging from cutting-edge algorithmic modifications to groundbreaking changes in system designs. The survey aims to provide a comprehensive understanding of the current state and future directions in efficient LLM serving, offering valuable insights for researchers and practitioners in overcoming the barriers of effective LLM deployment, thereby reshaping the future of AI.
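
One concrete optimization in the space this survey maps is key-value (KV) caching during autoregressive decoding. The sketch below is an illustrative example, not code from the survey; it assumes the Hugging Face `transformers` library and the small `gpt2` checkpoint, and shows how reusing `past_key_values` keeps per-token decode cost roughly constant instead of re-running attention over the full prefix.

```python
# Sketch: greedy autoregressive decoding with KV-cache reuse.
# Assumes the Hugging Face `transformers` library and the `gpt2` checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

input_ids = tokenizer("Efficient LLM serving requires", return_tensors="pt").input_ids

with torch.no_grad():
    # Prefill: run the whole prompt once and keep the attention KV cache.
    out = model(input_ids, use_cache=True)
    past = out.past_key_values
    next_id = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
    generated = [next_id]

    # Decode: each step feeds only the newest token plus the cached KVs,
    # avoiding recomputation of attention over the full prefix.
    for _ in range(20):
        out = model(next_id, past_key_values=past, use_cache=True)
        past = out.past_key_values
        next_id = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
        generated.append(next_id)

print(tokenizer.decode(torch.cat(generated, dim=-1)[0]))
```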

### Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
+ https://arxiv.org/abs/2304.01373
+ How do large language models (LLMs) develop and evolve over the course of training? How do these patterns change as models scale? To answer these questions, we introduce *Pythia*, a suite of 16 LLMs all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters. We provide public access to 154 checkpoints for each one of the 16 models, alongside tools to download and reconstruct their exact training dataloaders for further study. We intend *Pythia* to facilitate research in many areas, and we present several case studies including novel results in memorization, term frequency effects on few-shot performance, and reducing gender bias. We demonstrate that this highly controlled setup can be used to yield novel insights toward LLMs and their training dynamics. Trained models, analysis code, training code, and training data can be found at this https URL.
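
As a usage sketch of the checkpoint release the abstract describes, the snippet below loads one intermediate Pythia checkpoint; it assumes the models are hosted on the Hugging Face Hub under the `EleutherAI/pythia-*` namespace with one git revision per saved training step (e.g. `step3000`), as documented in the project's repository.

```python
# Sketch: loading a specific intermediate Pythia checkpoint for training-dynamics analysis.
# Assumes the checkpoints are published on the Hugging Face Hub as `EleutherAI/pythia-<size>`
# with one git revision per saved training step (e.g. "step3000").
from transformers import AutoTokenizer, GPTNeoXForCausalLM

model_name = "EleutherAI/pythia-70m"   # smallest of the 16 models (70M parameters)
revision = "step3000"                  # one of the 154 saved training checkpoints

tokenizer = AutoTokenizer.from_pretrained(model_name, revision=revision)
model = GPTNeoXForCausalLM.from_pretrained(model_name, revision=revision)

inputs = tokenizer("The Pythia suite is designed for", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```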

## More Readings:

### OpenMoE
+ https://github.com/XueFuzhao/OpenMoE


### LangChain
+ https://python.langchain.com/docs/get_started/introduction

### LlamaIndex
