AI RESEARCH

Rooted Absorbed Prefix Trajectory Balance with Submodular Replay for GFlowNet Training

arXiv CS.AI

arXiv:2603.00454v2 Announce Type: replace-cross Generative Flow Networks (GFlowNets) enable fine-tuning large language models to approximate reward-proportional posteriors, but they remain prone to mode collapse, manifesting as prefix collapse and length bias. We attribute this to two factors: (i) weak credit assignment to early prefixes, and (ii) biased replay that induces a shifted, non-representative