AI RESEARCH
What Does Vision Tool-Use Reinforcement Learning Really Learn? Disentangling Tool-Induced and Intrinsic Effects for Crop-and-Zoom
arXiv CS.CV
•
ArXi:2602.01334v2 Announce Type: replace Vision tool-use reinforcement learning (RL) can equip vision language models with visual operators such as crop-and-zoom and achieves strong performance gains, yet it remains unclear whether these gains are driven by improvements in tool use or evolving intrinsic capabilities. We