InteractScience: Programmatic and Visually-Grounded Evaluation of Interactive Scientific Demonstration Code Generation

arXiv:2510.09724v2 Announce Type: replace-cross Large Language Models (LLMs) are increasingly capable of generating complete applications from natural language instructions, creating new opportunities in science and education. In these domains, interactive scientific demonstrations are particularly valuable for explaining concepts, supporting new teaching methods, and presenting research findings. Generating such demonstrations requires models to combine accurate scientific knowledge with the ability to implement interactive front-end code that behaves correctly and responds to user actions.