A Close Look At World Model Recovery In Supervised Fine-Tuned LLM Planners

arXiv:2606.03685v1 Announce Type: new Supervised fine-tuning (SFT) improves end-to-end classical planning in large language models (LLMs), but do these models also learn to represent and reason about the planning problems they are solving? Due to the relative complexity of classical planning problems and the challenge that end-to-end plan generation poses for LLMs, it has been difficult to explore this question.