AI RESEARCH

WorldGUI: An Interactive Benchmark for Desktop GUI Automation from Any Starting Point

arXiv CS.AI

arXiv:2502.08047v5 Announce Type: replace Recent progress in GUI agents has substantially improved visual grounding, yet robust planning remains challenging, particularly when the environment deviates from a canonical initial state. In real applications, users often invoke assistance mid-workflow, where software may be partially configured, steps may have been executed in different orders, or the interface may differ from its default setup. Such task-state variability is pervasive but insufficiently evaluated in existing GUI benchmarks. To address this gap, we.