AI RESEARCH
Are Full Rollouts Necessary for On-Policy Distillation?
arXiv CS.CL
arXiv:2605.31490v1 Announce Type: new On-policy distillation (OPD) provides dense teacher feedback along rollouts generated by the student and has emerged as a promising post-