
FloorplanQA: A Benchmark for Spatial Reasoning in LLMs using Structured Representations

arXiv CS.AI

We introduce FloorplanQA, a diagnostic benchmark for evaluating spatial reasoning in large language models (LLMs). FloorplanQA is grounded in structured representations of indoor scenes (e.g., kitchens, living rooms, bedrooms, and bathrooms), encoded symbolically as JSON or XML layouts. Our results across a variety of frontier open-so