AI RESEARCH

From Patches to Trajectories: Privileged Process Supervision for Software-Engineering Agents

arXiv CS.AI

arXiv:2605.21996v1 Announce Type: cross Supervised fine-tuning (SFT) on long teacher trajectories is the dominant way to instill investigation and reasoning in open software-engineering (SWE) agents. Since every retained response becomes an imitation target, the student inherits the final outcome and intermediate flaws, including ungrounded leaps and redundant loops. High-quality