Issues: THUDM/AgentBench
Issues list
[Assistance] OS task return info
bug
Something isn't working
help wanted
Extra attention is needed
#173
opened Nov 19, 2024 by
xiaxiaxiatengxi
[Bug/Assistance] '{"detail":"Error: Task does not exist"}', 400, 'webshop-std'
bug
Something isn't working
help wanted
Extra attention is needed
#170
opened Oct 23, 2024 by
AlphaLee1113
[Data availability] Model trajectories
enhancement
New feature or request
#169
opened Oct 20, 2024 by
felipemaiapolo
[Assistance] How to reproduce the effect shown in the demo video
bug
Something isn't working
help wanted
Extra attention is needed
#168
opened Oct 9, 2024 by
XGJ111
[Feature] Questions about the Docker setup for the game scenario: errors referencing http://nginx.org/r/error_log. Is this caused by Docker having no external network access?
enhancement
New feature or request
#166
opened Sep 10, 2024 by
kai0705
[Bug/Assistance]
bug
Something isn't working
help wanted
Extra attention is needed
#154
opened Jul 30, 2024 by
matinaghaei
Could you please upload the dockerfile?
bug
Something isn't working
help wanted
Extra attention is needed
#152
opened Jul 25, 2024 by
HCHCXY
[Bug/Assistance] A lot of os-std tasks are impossible
bug
Something isn't working
help wanted
Extra attention is needed
#151
opened Jul 25, 2024 by
rjmoss
[Bug/Assistance] How to use a local model to replace gpt3.5?
bug
Something isn't working
help wanted
Extra attention is needed
#150
opened Jul 19, 2024 by
lambda7xx
[Bug/Assistance] Card game evaluation of an open-source LLM fails at runtime: failed with error INTERACT_FAILED {"detail":"Error: Worker not responding\n"}
bug
Something isn't working
help wanted
Extra attention is needed
#147
opened Jul 9, 2024 by
moon-fall
Problems encountered when deploying a local model via fastchat
bug
Something isn't working
help wanted
Extra attention is needed
#146
opened Jul 4, 2024 by
YinSonglin1997
DBbench-std task with error "Can't connect to MySQL server"
bug
Something isn't working
help wanted
Extra attention is needed
#145
opened Jun 27, 2024 by
realbillbao
Urgent: if one of the problems throws an error, why does overall.json not show up?
bug
Something isn't working
help wanted
Extra attention is needed
#144
opened Jun 21, 2024 by
ishapuri
Will llama3, wizardlm2, and other recent models be tested and published on the leaderboard? Request to add test results for large models released in April to May 2024, such as llama3 and wizardlm.
enhancement
New feature or request
#136
opened May 11, 2024 by
dercaft
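[Feature] How is the score for each task computed? For example, the OS task yields only an accuracy, but Table 3 in the paper reports a score for each task. I could not find the mapping between the two in the paper; could you point me to it?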
enhancement
New feature or request
#135
opened May 10, 2024 by
lonerFarea
Is testing with OpenAI's tool_call interface supported?
enhancement
New feature or request
#132
opened Apr 9, 2024 by
Maybewuss
Excellent job! No offense, but in essence this seems to be an LLM-Bench rather than an AgentBench.
enhancement
New feature or request
#130
opened Mar 26, 2024 by
Konisberg
[Bug/Assistance] What is going on with "unknown" in mind2web?
bug
Something isn't working
help wanted
Extra attention is needed
#129
opened Mar 24, 2024 by
Tangent-90C
OS std test set results
bug
Something isn't working
help wanted
Extra attention is needed
#128
opened Mar 18, 2024 by
xqun3
[Bug/Assistance] - Reproducing Results on Alfworld (HH) (vs. ReAct paper)
bug
Something isn't working
help wanted
Extra attention is needed
#127
opened Mar 9, 2024 by
ai-nikolai
Benchmark for mistral models
enhancement
New feature or request
#122
opened Mar 1, 2024 by
mingxuan-he