AI RESEARCH
Reasoning Depth and Environment Complexity: A Controlled Study of RLVR Data Allocation across Logical Reasoning Tasks
arXiv CS.AI
arXiv:2605.26934v1 Announce Type: cross Reinforcement learning with verifiable rewards (RLVR) has become central to post-