AI RESEARCH
3D Gaussian Map with Open-Set Semantic Grouping for Vision-Language Navigation
arXiv cs.CV
arXiv:2605.26500v1
Vision-language navigation (VLN) requires an agent to traverse complex 3D environments by following natural language instructions, which demands thorough scene understanding. Existing works equip agents with various scene representations to enhance spatial awareness, but they often neglect the complex 3D geometry and rich semantics of VLN scenarios, which limits their ability to generalize across diverse and unseen environments.