AI RESEARCH

BenchEvolver: Frontier Task Synthesis via Solution-Centric Evolution

arXiv cs.AI

arXiv:2606.01286v1 Announce Type: cross The rapid progress of frontier large language models has led to widespread benchmark saturation, limiting the ability of existing datasets to differentiate model capabilities or provide useful