You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
cd webarena
python pipeline.py --website "shopping"# choose one from ['shopping', 'shopping_admin', 'reddit', 'gitlab', 'map']
To run AWM on Mind2Web under mind2web/:
cd mind2web
python pipeline.py --setup "offline"# or "online"
Check webarena/ and mind2web/ folders for more detailed instructions about environment and data setups.
What is Agent Workflow Memory? 🧠
Agent Workflow Memory (AWM) proposes to induce, integrate, and utilize workflows via an agent memory.
A workflow is usually a common sub-routine in solving tasks, with example-specific contexts being abstracted out.
AWM can operate in both offline and online settings:
offline (left): when additional (e.g., training) examples are available, agents induce workflows from ground-truth annotated examples
online (right): without any auxiliary data, agents induce workflows from past experiences on the fly.
How does AWM work? 📈
On WebArena
We achieve the state-of-the-art result -- 35.6% success rate.
Check the code in ./webarena/ directory.
On Mind2Web
We also get the best scores among text-based agents. Particularly, AWM offline effectively generalizes across a wide range of tasks, websites, and domains.
Check the code in ./mind2web/ directory.
Citation 📜
@inproceedings{awm2024wang,
title = {Agent Workflow Memory},
author = {Wang, Zhiruo anf Mao, Jiayuan, and Fried, Daniel and Neubig, Graham},
journal={arXiv preprint arXiv:2409.07429},
year = {2024},
}