Strict Subgoal Execution: Reliable Long-Horizon Planning in Hierarchical Reinforcement Learning

arXiv:2506.21039v3 Announce Type: replace Long-horizon goal-conditioned tasks pose fundamental challenges for reinforcement learning (RL), particularly when goals are distant and rewards are sparse. While hierarchical and graph-based methods offer partial solutions, their reliance on conventional hindsight relabeling often fails to correct subgoal infeasibility, leading to inefficient high-level planning.