DiscoverPhysics: Benchmarking LLMs for Out-of-the-Box Scientific Thinking

arXiv:2605.26087v1 Announce Type: cross Frontier LLMs now perform strongly across a wide range of physics evaluations, but it is hard to disentangle genuine reasoning from recall of established science. We