GUITestScape: Towards Open-set Evaluation on Exploratory GUI Testing

arXiv:2605.29532v1 Announce Type: cross Exploratory GUI testing is a particularly demanding setting for MLLM agents: without predefined test scripts, an agent must autonomously navigate an application and discover defects through its own interaction. However, current evaluation falls short on two fronts. First, existing benchmarks focus almost exclusively on interaction defects, leaving display defects outside the evaluation frame.