This repository contains a modified version of Windows Agent Arena (WAA) 🪟, a scalable Windows AI agent platform for testing and benchmarking multi-modal, desktop AI agents. This modified version focuses on integration with UFO, a UI-Focused Agent for Windows OS Interaction.
We highly recommend you have a look at the deployment guide from the original WindowsAgentArena repository. Our guide here assumes you are familiar with the deployment process of the original repository. The following steps will help you set up the environment for running the UFO agent in the Windows Agent Arena.
Clone the repository:

```bash
git clone https://github.com/nice-mee/WindowsAgentArena.git
```
Note: If you want to run OSWorld cases, check out the `2020-qqtcg/dev` branch:

```bash
git checkout 2020-qqtcg/dev
```
Create a `config.json` file in the root of the WAA repo. The API key here doesn't matter, since UFO will only use the key from its own config file:

```json
{
    "OPENAI_API_KEY": "placeholder"
}
```
Next, build the WinArena image locally:
```bash
cd scripts
chmod +x build-container-image.sh # (if required)
chmod +x prepare-agents.sh # (if required)
./build-container-image.sh --build-base-image true
```

This will create the `windowsarena/winarena:latest` image with the latest code from the `src` directory.
You should first configure UFO with `ufo/config/config.json` (refer to the UFO repo for details). Then copy the entire `ufo` folder to `WindowsAgentArena/src/win-arena-container/client/`:

```bash
cp -r src/win-arena-container/vm/setup/mm_agents/UFO/ufo src/win-arena-container/client/
```
Remember to swap the order of the `@staticmethod` and `@functools.lru_cache()` decorators in `src/win-arena-container/client/ufo/llm/openai.py`, so that `@staticmethod` is the outer (topmost) decorator. This works around a Python 3.9 limitation: `staticmethod` objects only became callable in Python 3.10, so placing `@functools.lru_cache()` above `@staticmethod` fails under 3.9. Unfortunately WAA uses Python 3.9, whereas UFO targets Python 3.10.
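A minimal sketch of the swap (the class and method names below are hypothetical; apply it to whichever method in `openai.py` stacks these two decorators):

```python
import functools


class OpenAIService:
    # Before (fine on Python 3.10+, breaks on 3.9): functools.lru_cache receives
    # a staticmethod object, which is not callable until Python 3.10.
    #
    # @functools.lru_cache()
    # @staticmethod
    # def cached_helper(arg): ...

    # After (works on Python 3.9): lru_cache wraps the plain function first,
    # and staticmethod then wraps the cached callable.
    @staticmethod
    @functools.lru_cache()
    def cached_helper(arg):
        return arg  # placeholder body
```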
- Visit Microsoft Evaluation Center, accept the Terms of Service, and download a Windows 11 Enterprise Evaluation (90-day trial, English, United States) ISO file [~6GB]
- After downloading, rename the file to `setup.iso` and copy it to the directory `WindowsAgentArena/src/win-arena-container/vm/image` (see the sketch below)
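A minimal sketch of that step, assuming the ISO landed in `~/Downloads` under a placeholder name and that you run it from the parent directory of the cloned repository:

```bash
# Rename the downloaded evaluation ISO and place it where the VM build expects it.
mv ~/Downloads/<downloaded-iso-name>.iso WindowsAgentArena/src/win-arena-container/vm/image/setup.iso
```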
Before running the arena, you need to prepare a new WAA snapshot (also referred to as the WAA golden image). This 30GB snapshot represents a fully functional Windows 11 VM with all the programs needed to run the benchmark. This VM additionally hosts a Python server which receives and executes agent commands. To learn more about the components at play, see our local and cloud components diagrams.
To prepare the golden snapshot, run once:
```bash
cd ./scripts
./run-local.sh --mode dev --prepare-image true
```
Please do not interfere with the VM while it is being prepared. It will automatically shut down when the provisioning process is complete.
You will find the 30GB WAA golden image in `WindowsAgentArena/src/win-arena-container/vm/storage`.
Start the initial run with this command:
```bash
./run-local.sh --mode dev --json-name "evaluation_examples_windows/test_custom.json" --agent UFO --agent-settings '{"llm_type": "azure", "llm_endpoint": "https://cloudgpt-openai.azure-api.net/openai/deployments/gpt-4o-20240513/chat/completions?api-version=2024-04-01-preview", "llm_auth": {"type": "api-key", "token": ""}}'
```
After booting up, wait until the device code prompt shows up, but do not enter the device code. This blocks the WAA server for as long as the code is not entered. Instead, visit `localhost:8006`, take control of the WAA Windows VM, and do the following:
- Disable Windows Firewall.
- Open Google Chrome and complete the initial setup.
- Open VLC and complete the initial setup.
After completing these steps, kill the WAA client, then copy the "golden" image under the `storage` folder somewhere else as a backup (see the sketch below).
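A minimal sketch, run from the repository root, with a hypothetical backup location:

```bash
# Back up the freshly prepared golden image so later runs can start from a clean state.
mkdir -p ~/waa-golden-backup   # hypothetical backup location
cp -r src/win-arena-container/vm/storage ~/waa-golden-backup/
```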
Before each experiment run, do the following (see the sketch after this list):
- Replace the `storage` image with the previously backed-up golden image
- Delete the UFO logs
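A sketch under the same assumptions as above (the backup path is the hypothetical one from the previous step, and the UFO log location is an assumption; adjust it to wherever your UFO run writes its logs):

```bash
# Restore the clean golden image before the run.
rm -rf src/win-arena-container/vm/storage
cp -r ~/waa-golden-backup/storage src/win-arena-container/vm/storage

# Clear UFO logs left over from the previous run (assumed location).
rm -rf src/win-arena-container/client/ufo/logs
```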
Then run this command:
```bash
./run-local.sh --mode dev --json-name "evaluation_examples_windows/test_full.json" --agent UFO --agent-settings '{"llm_type": "azure", "llm_endpoint": "https://cloudgpt-openai.azure-api.net/openai/deployments/gpt-4o-20240513/chat/completions?api-version=2024-04-01-preview", "llm_auth": {"type": "api-key", "token": ""}}'
```
You will probably use a different LLM type or endpoint, so make sure to change the `--agent-settings` parameter accordingly.
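For example, a sketch that keeps the same JSON shape but points at your own Azure OpenAI deployment (the endpoint, deployment, and token values are placeholders; which other `llm_type` values are accepted depends on how the UFO/WAA agent settings are parsed):

```bash
./run-local.sh --mode dev \
    --json-name "evaluation_examples_windows/test_full.json" \
    --agent UFO \
    --agent-settings '{"llm_type": "azure", "llm_endpoint": "https://<your-resource>.openai.azure.com/openai/deployments/<your-deployment>/chat/completions?api-version=2024-04-01-preview", "llm_auth": {"type": "api-key", "token": "<your-api-key>"}}'
```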
Note: `test_full.json` contains all the test cases where UIA works; `test_all.json` contains all the test cases, even if UIA doesn't work. So please use `test_full.json` if OmniParser is not used.