AI RESEARCH
Less is More: Early Stopping Rollout for On-Policy Distillation
arXiv cs.AI
arXiv:2605.27028v1 Announce Type: cross On-policy distillation has recently emerged as a promising alternative to standard sequence-level imitation