AI RESEARCH

Post-Training is About States, Not Tokens: A State Distribution View of SFT, RL, and On-Policy Distillation

arXiv CS.AI • May 23, 2026

arXiv:2605.22731v1 Announce Type: cross Large language model post-
