Planner-Centric Reinforcement Learning for Deep Research with Structure-Aware Reward

arXiv:2605.30824v1 Announce Type: new

Deep research tasks require LLMs to plan what to investigate, retrieve evidence, and synthesize long-form answers across multiple branches of inquiry. Existing