AI RESEARCH
TraceGraph: Shared Decision Landscapes for Diagnosing and Improving Agent Trajectories
arXiv CS.AI
arXiv:2605.31308v1 Announce Type: new Agent benchmarks increasingly record rich interaction trajectories, yet evaluation often reduces each rollout to a pass rate or reward score. We