Bridging Structure and Language: Graph-Based Visual Reasoning for Autonomous Road Understanding

arXiv:2605.20942v1

Structured road understanding of lane geometry, topology, and traffic element relationships is foundational to safe autonomous driving. While vision-language models (VLMs) offer promising semantic flexibility, they lack the geometric and relational grounding required for precise road reasoning. Conversely, traditional modular systems, e.g., HD maps and topological road graphs, provide structural precision but remain semantically rigid. To bridge this gap, we.