Latent Representation Alignment for Offline Goal-Conditioned Reinforcement Learning

arXiv:2605.25740v1 Announce Type: new Offline goal-conditioned reinforcement learning (GCRL) provides a practical framework for obtaining goal-reaching policies from fixed datasets. However, learning a reliable goal-conditioned value function in long-horizon tasks remains challenging. In this paper, we identify erroneous generalization in goal-conditioned value functions as a fundamental bottleneck, and demonstrate that an appropriate inductive bias in the value function is crucial for addressing this bottleneck.