AI RESEARCH
GroupTravelBench: Benchmarking LLM Agents on Multi-Person Travel Planning
arXiv CS.CL
arXiv:2605.25200v1
Travel planning is a realistic task for evaluating the planning and tool-use abilities of LLM agents. However, existing benchmarks typically assume only a single user, thereby avoiding one of the most challenging aspects of real-world scenarios: an agent's ability to identify and resolve conflicts among multiple users. To address this gap, we