[Aug. 2023] Initial Release of BOLAA paper and implementation code!
Introduction
This is the repo for the BOLAA paper.
In this paper, we create a benchmark of LLM-augmented Autonomous Agents (LAA).
We compare 6 different LAA architectures, including 5 existing designs and 1 new BOLAA agent.
All of these agents are paired with different LLMs to compare performance.
BOLAA is able to communicate with and orchestrate multiple specialist agents:
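As a rough illustration of the orchestration idea, the sketch below routes each observation to one of two specialist agents. It is a toy under stated assumptions and not the code in this repo: the names `Controller`, `search_agent`, `click_agent`, and `rule_based_select` are hypothetical, and the selection rule is a simple string check where a real controller could query an LLM.

```python
from typing import Callable, Dict

# Hypothetical specialist agents: each maps an observation to an action string.
def search_agent(observation: str) -> str:
    return "search[placeholder query]"

def click_agent(observation: str) -> str:
    return "click[placeholder item]"

class Controller:
    """Toy orchestrator that hands each observation to one specialist agent."""

    def __init__(self, agents: Dict[str, Callable[[str], str]],
                 select: Callable[[str], str]) -> None:
        self.agents = agents
        self.select = select  # assumption: could itself be an LLM call

    def step(self, observation: str) -> str:
        name = self.select(observation)        # pick a specialist by name
        return self.agents[name](observation)  # delegate action generation

def rule_based_select(observation: str) -> str:
    # Assumption for illustration: pages showing a search bar go to the search agent.
    return "search" if "[search]" in observation.lower() else "click"

controller = Controller({"search": search_agent, "click": click_agent},
                        rule_based_select)
print(controller.step("Home page with [Search] bar"))
print(controller.step("Results page: item A, item B"))
```

Keeping the selection function separate from the agents means the routing policy can be swapped without touching the specialists.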
We tested on two types of environments: the WebShop navigation environment and the HotPotQA environment.
An example of the BOLAA web agent simulation in the WebShop environment is:
Besides the BOLAA architecture, we also devise five standard LAA architectures: Zeroshot (ZS), Zeroshot-Think (ZST), ReAct, PlanAct, and PlanReAct, as follows:
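To make the difference between these five architectures concrete, here is a minimal sketch that builds a prompt skeleton for each one. It only summarizes what the names suggest (whether the agent plans first and whether it emits a thought before acting). The names `ARCH_STEPS`, `INSTRUCTIONS`, and `build_prompt`, and the instruction wording, are hypothetical and are not taken from this repo.

```python
from typing import Dict, List

# Illustrative step order per architecture (an assumption, not the repo's code).
# "plan" = draft a plan first, "think" = emit a reasoning step, "act" = emit an action.
ARCH_STEPS: Dict[str, List[str]] = {
    "ZS": ["act"],
    "ZST": ["think", "act"],
    "ReAct": ["think", "act"],  # ReAct also relies on few-shot examples, not modeled here
    "PlanAct": ["plan", "act"],
    "PlanReAct": ["plan", "think", "act"],
}

# Hypothetical instruction text for each step type.
INSTRUCTIONS: Dict[str, str] = {
    "plan": "Plan: outline the steps before interacting.",
    "think": "Thought: reason about the current observation.",
    "act": "Action: output the next action.",
}

def build_prompt(arch: str, task: str) -> str:
    """Return a prompt skeleton listing the steps the given architecture takes."""
    lines = [f"Task: {task}"]
    lines.extend(INSTRUCTIONS[step] for step in ARCH_STEPS[arch])
    return "\n".join(lines)

for arch in ARCH_STEPS:
    print(f"--- {arch} ---")
    print(build_prompt(arch, "buy a red mug"))
```

In this sketch ZST and ReAct produce the same skeleton; the distinction between them lies in the few-shot examples, which are left out to keep the example short.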
Installation
Set up FastChat to use local open-source LLMs. Skip to the next step if you only test the OpenAI API.
Commands for other agent options can be found in test_hotpotqa.sh. The implementation code for the various web agents is in hotpotqa_run.
Citation
If you find our paper or code useful, please cite:
@misc{liu2023bolaa,
title={BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents},
author={Zhiwei Liu and Weiran Yao and Jianguo Zhang and Le Xue and Shelby Heinecke and Rithesh Murthy and Yihao Feng and Zeyuan Chen and Juan Carlos Niebles and Devansh Arpit and Ran Xu and Phil Mui and Huan Wang and Caiming Xiong and Silvio Savarese},
year={2023},
eprint={2308.05960},
archivePrefix={arXiv},
primaryClass={cs.AI}
}