UltraCUA: A Foundation Model for Computer Use Agents with Hybrid Action
arXiv CS.CL
arXiv:2510.17790v3 Announce Type: replace-cross
Computer-use agents face a fundamental limitation: they rely exclusively on primitive GUI actions (click, type, scroll), creating brittle execution chains prone to cascading failures. While API-driven agents harness rich capabilities through structured interfaces and tools, computer-use agents remain constrained to low-level visual interactions. We present UltraCUA, a foundation model that transcends this limitation through hybrid action, seamlessly unifying primitive GUI operations with high-level tool execution.
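To make the idea of a hybrid action space concrete, here is a minimal Python sketch, not the UltraCUA implementation. It assumes an agent emits either a primitive GUI action or a named tool call, and that a single executor dispatches both. The names `GuiAction`, `ToolCall`, `HybridExecutor`, and the `join_words` tool are hypothetical and chosen only for illustration; the GUI branch merely records the action rather than driving a real screen.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union

# The primitive GUI actions named in the abstract.
PRIMITIVES = {"click", "type", "scroll"}


@dataclass
class GuiAction:
    """A primitive GUI operation (hypothetical representation)."""
    kind: str
    args: Dict[str, object] = field(default_factory=dict)


@dataclass
class ToolCall:
    """A high-level tool invocation by name (hypothetical representation)."""
    name: str
    args: Dict[str, object] = field(default_factory=dict)


# A hybrid action is either kind; the agent chooses per step.
HybridAction = Union[GuiAction, ToolCall]


class HybridExecutor:
    """Dispatches both action kinds through one interface (illustrative only)."""

    def __init__(self, tools: Dict[str, Callable[..., object]]):
        self.tools = dict(tools)
        self.log: List[str] = []

    def execute(self, action: HybridAction) -> object:
        if isinstance(action, ToolCall):
            if action.name not in self.tools:
                raise KeyError(f"unknown tool: {action.name}")
            result = self.tools[action.name](**action.args)
            self.log.append(f"tool:{action.name}")
            return result
        if action.kind not in PRIMITIVES:
            raise ValueError(f"unknown GUI primitive: {action.kind}")
        # Assumption: a real agent would drive the screen here; we only log.
        self.log.append(f"gui:{action.kind}")
        return None


if __name__ == "__main__":
    executor = HybridExecutor({"join_words": lambda words: " ".join(words)})
    print(executor.execute(ToolCall("join_words", {"words": ["hybrid", "action"]})))
    executor.execute(GuiAction("click", {"x": 120, "y": 48}))
    print(executor.log)
```

In this sketch, one tool call can stand in for what would otherwise be a chain of clicks and keystrokes, which is the brittleness the abstract attributes to GUI-only agents.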