Learning to Search and Searching to Learn for Generalization in Planning

arXiv:2605.25720v1 Announce Type: new Combinatorial generalization remains a central challenge in Deep Reinforcement Learning (DRL). Classical planning provides a simple yet challenging setting to study this problem through explicit relational descriptions, without requiring learning from perception. In sparse-reward domains, standard RL exploration via real-time search is ineffective, and learning-based planning methods often rely on expert demonstrations, hindsight relabeling, or random walks from the goal state.