ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use
📢 Updates
(May 19, 2025) 🔥🔥🔥 We're excited to introduce our new model, SE-GUI! It achieves 47.2% accuracy with a 7B model and 35.9% with a 3B model, trained on just 3k open-source samples. Check out the arXiv paper!
(Jan 4, 2025) The paper and dataset are released. Please also check out ScreenSpot-v2-variants which contains more instruction styles (original instruction, action, target UI description, and negative instructions).
Set Up
Before you begin, ensure your environment variables are set:
OPENAI_API_KEY: Your OpenAI API key.
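A minimal sketch of one way to confirm the variable is set before launching an evaluation. The helper name `require_api_key` is hypothetical and is not part of this repository; it only checks that `OPENAI_API_KEY` is non-empty and does not validate the key itself.

```shell
# Hypothetical helper (not provided by this repo): fail early with a clear
# message if OPENAI_API_KEY is unset or empty.
require_api_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
}

# Example usage (replace the placeholder with your own key):
#   export OPENAI_API_KEY="<your-key>"
#   require_api_key && bash run_ss_pro.sh
```

Setting the variable with `export` in the same shell session that runs the script makes it visible to the processes the script starts.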
Evaluation
Use the shell scripts to launch the evaluation.
bash run_ss_pro.sh
or
bash run_ss_pro_cn.sh
Citation
Please consider citing if you find our work useful:
@inproceedings{
li2025screenspotpro,
title={ScreenSpot-Pro: {GUI} Grounding for Professional High-Resolution Computer Use},
author={Kaixin Li and Meng Ziyang and Hongzhan Lin and Ziyang Luo and Yuchen Tian and Jing Ma and Zhiyong Huang and Tat-Seng Chua},
booktitle={Workshop on Reasoning and Planning for Large Language Models},
year={2025},
url={https://openreview.net/forum?id=XaKNDIAHas}
}