AI RESEARCH
Rooted Absorbed Prefix Trajectory Balance with Submodular Replay for GFlowNet Training
arXiv CS.AI
arXiv:2603.00454v2 Announce Type: replace-cross
Generative Flow Networks (GFlowNets) enable fine-tuning large language models to approximate reward-proportional posteriors, but they remain prone to mode collapse, manifesting as prefix collapse and length bias. We attribute this to two factors: (i) weak credit assignment to early prefixes, and (ii) biased replay that induces a shifted, non-representative