Think Twice Before You Act: Enhancing Agent Behavioral Safety with Thought Correction

arXiv:2505.11063v3 Announce Type: replace

LLM-based agents solve complex tasks through iterative reasoning, tool use, and environment interaction, where each intermediate thought directly shapes subsequent actions. Small deviations in these thoughts can therefore propagate into unsafe behaviors, yet existing guardrails typically operate only on final outputs or require intrusive model modifications. We