AI RESEARCH

Agent Planning Benchmark: A Diagnostic Framework for Planning Capabilities in LLM Agents

arXiv CS.CL

arXiv:2606.04874v1 Announce Type: new Planning is central to LLM agents: before acting, an agent must decompose goals, select tools, reason over constraints, and decide when a task is infeasible. Yet existing agent evaluations often report only end-to-end success, making it difficult to determine whether failures stem from planning or execution. We