AI RESEARCH

TraceGraph: Shared Decision Landscapes for Diagnosing and Improving Agent Trajectories

arXiv CS.AI

arXiv:2605.31308v1 Announce Type: new Agent benchmarks increasingly record rich interaction trajectories, yet evaluation often reduces each rollout to a pass rate or reward score. We