AI RESEARCH
Learning to Solve, Forgetting to Retain: Correct-Set Turnover in RLVR
arXiv cs.LG
arXiv:2606.03087v1 Announce Type: new Reinforcement learning with verifiable rewards (RLVR) improves the ability of large language models, yet headline accuracy gains often conceal a hidden cost: previously solved problems quietly become unsolvable as