What does an agent-friendly UI look like? Check out the demo below:
The left UI is designed for 🧑🏻💻 humans, prioritizing aesthetics.
The right UI is redesigned for 🤖 agents, focusing on clarity and functionality.
Can Computer-Use Agents (CUAs) offer feedback that helps Coders generate UIs?
👨💻 Human Collaboration vs. 🤖️ Coder-CUA Collaboration. Left: Most GUIs are designed by humans and optimized for human user experience (e.g., aesthetics), forcing trained agents to adapt to human-oriented behaviors. Right: Our Coder-CUA Collaboration framework pairs the Coder as Designer with the CUA as Judge, enabling more reliable task execution and improved usability for agents. We call the result AUI: Agent-friendly UI.
🔧 How does it work?
Overview of the Coder-CUA Collaboration framework.
The process begins with the Coder as Designer, which initializes and iteratively revises the UI based on queries and feedback. In parallel, the CUA as Judge executes task-driven navigation within the testing environment, generating trajectories and error logs to evaluate task solvability. A verifier ensures functional correctness, while feedback from CUA navigation informs subsequent UI revisions. This collaboration yields a finalized agent-centric UI optimized for both functionality and execution success.
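The loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `collaborate`, `Feedback`, and the `design` / `verify` / `navigate` callables are hypothetical names, the toy stubs stand in for a real Coder model, verifier, and CUA, and evaluation runs sequentially here even though the text describes the CUA working in parallel.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class Feedback:
    """Signals handed back to the Coder after one evaluation round (hypothetical)."""
    unsupported_tasks: List[str] = field(default_factory=list)  # rejected by the verifier
    failed_tasks: List[str] = field(default_factory=list)       # CUA could not finish
    error_logs: List[str] = field(default_factory=list)         # CUA navigation logs


def collaborate(
    query: str,
    tasks: List[str],
    design: Callable[[str, Optional[Any], Optional[Feedback]], Any],
    verify: Callable[[Any, str], bool],
    navigate: Callable[[Any, str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Any:
    """Revise a UI until every task is verified and solved, or rounds run out."""
    ui = design(query, None, None)  # Coder as Designer: initial UI
    for _ in range(max_rounds):
        fb = Feedback()
        for task in tasks:
            # Verifier: is the task functionally supported by this UI?
            if not verify(ui, task):
                fb.unsupported_tasks.append(task)
                continue  # assumption: skip navigation for unsupported tasks
            # CUA as Judge: try to complete the task and keep the error log.
            solved, log = navigate(ui, task)
            if not solved:
                fb.failed_tasks.append(task)
                fb.error_logs.append(log)
        if not fb.unsupported_tasks and not fb.failed_tasks:
            break  # every task is supported and solvable
        ui = design(query, ui, fb)  # Coder revises the UI from the feedback
    return ui  # note: a UI revised in the last round is returned unevaluated


# Toy stand-ins so the sketch runs end to end (all hypothetical).
def toy_design(query, prev_ui, fb):
    if prev_ui is None:
        return {"features": {"add"}, "flat": False}
    return {
        "features": prev_ui["features"] | set(fb.unsupported_tasks),
        "flat": prev_ui["flat"] or bool(fb.failed_tasks),  # flatten nested menus
    }


def toy_verify(ui, task):
    return task in ui["features"]


def toy_navigate(ui, task):
    return (True, "ok") if ui["flat"] else (False, f"{task}: target hidden in nested menu")


final_ui = collaborate("todo app", ["add", "delete"], toy_design, toy_verify, toy_navigate)
print(final_ui)
```

In the toy run, the first round reports one unsupported task and one navigation failure, and the revised UI passes both checks in the second round. Keeping verifier failures and navigation failures in separate fields mirrors the two feedback sources named in the text.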
🎬 Qualitative Examples
We introduce AUI-Gym, a benchmark of 52 applications with 1560 tasks.
Each screenshot compares the initial UI with the agent-optimized UI for an individual coder.
📊 New UIs Improve Success Rate
We evaluate our framework using Function Completeness Rate (FC) and CUA Success Rate (SR). Our Coder-CUA collaboration improves both metrics over the baseline for every coder: FC gains range from +0.8 to +18.0 points and SR gains from +1.5 to +11.7 points. The strongest absolute results come from GPT-5 (81.5% FC) and Gemini-3-Pro (47.0% SR).
Overall Performance

| Coder | Method | Func. Completeness (%) | CUA Success Rate (%) |
|---|---|---|---|
| GPT-5 | Baseline | 67.9 | 24.5 |
| | + Ours | 81.5 | 26.0 |
| Qwen3-Coder-30B | Baseline | 42.1 | 7.3 |
| | + Ours | 60.1 | 19.0 |
| GPT-4o | Baseline | 36.3 | 8.8 |
| | + Ours | 43.1 | 16.1 |
| Gemini-3-Pro | Baseline | 71.7 | 35.8 |
| | + Ours | 72.5 | 47.0 |
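To read the table at a glance, the per-coder gains can be recomputed from the reported numbers. The snippet below is only a convenience sketch: the `results` dictionary and the `gains` helper are illustrative names, and the values are copied from the table above as (FC, SR) pairs in percent.

```python
# (FC %, SR %) per coder, copied from the results table.
results = {
    "GPT-5": {"baseline": (67.9, 24.5), "ours": (81.5, 26.0)},
    "Qwen3-Coder-30B": {"baseline": (42.1, 7.3), "ours": (60.1, 19.0)},
    "GPT-4o": {"baseline": (36.3, 8.8), "ours": (43.1, 16.1)},
    "Gemini-3-Pro": {"baseline": (71.7, 35.8), "ours": (72.5, 47.0)},
}


def gains(table):
    """Return {coder: (FC gain, SR gain)} in percentage points, rounded to 0.1."""
    out = {}
    for coder, row in table.items():
        base_fc, base_sr = row["baseline"]
        ours_fc, ours_sr = row["ours"]
        out[coder] = (round(ours_fc - base_fc, 1), round(ours_sr - base_sr, 1))
    return out


for coder, (fc_gain, sr_gain) in gains(results).items():
    print(f"{coder}: FC {fc_gain:+.1f}, SR {sr_gain:+.1f}")
```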
🎓 Citation
Please kindly cite our paper if you find this project helpful.
@misc{lin2025aui,
  title={Computer-Use Agents as Judges for Generative User Interface},
  author={Kevin Qinghong Lin and Siyuan Hu and Linjie Li and Zhengyuan Yang and Lijuan Wang and Philip Torr and Mike Zheng Shou},
  year={2025},
  eprint={2511.15567},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2511.15567},
}