AI RESEARCH

WebGameBench: Requirement-to-Application Evaluation for Coding Agents via Browser-Native Games

arXiv cs.AI

arXiv:2605.17637v2 Announce Type: replace

Coding agents are increasingly used as application builders, yet many evaluations still focus on source code, repository-level tests, or intermediate traces rather than the delivered application. We