AI RESEARCH

OR-Space: A Full-Lifecycle Workspace Benchmark for Industrial Optimization Agents

arXiv cs.AI

arXiv:2605.28158v1

Large language model (LLM) agents are increasingly used to assist with operations research (OR) modeling, yet existing OR-oriented benchmarks often reduce evaluation to a one-shot translation from a self-contained problem statement into a mathematical formulation or solver program. Such settings abstract away two characteristics of real industrial OR workflows: persistent multi-artifact workspaces and multi-stage task lifecycles. We