AI RESEARCH

Embodied3DBench: Benchmarking Low-Level Embodied Spatial Intelligence of Vision Language Models

arXiv cs.CV

arXiv:2605.29074v1 Announce Type: new Are current Vision Language Models (VLMs) ready to comprehend and reason about complex embodied interactions in 3D environments? We