AI RESEARCH
AIRGuard: Guarding Agent Actions with Runtime Authority Control
arXiv CS.AI
arXiv:2605.28914v1 Announce Type: cross Tool-using language agents turn model decisions into external side effects: they read files, run scripts, call APIs, send messages, and invoke Model Context Protocol tools. This makes agent attacks different from jailbreaks. The harmful step is often not an obviously forbidden output, but an ordinary executable action that becomes unsafe because attacker-controlled context steers authorized access against the user's interest.
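To make the threat model concrete, the following is a minimal illustrative sketch of a runtime check on agent actions. It is not AIRGuard's method, which this excerpt does not describe. The names `Action`, `allow`, `TRUSTED_SOURCES`, and `SIDE_EFFECT_TOOLS`, the tool names, and the provenance labels are all hypothetical, and the sketch assumes the agent runtime can already say which context sources influenced each proposed action.

```python
from dataclasses import dataclass

# Hypothetical labels for illustration only; none come from the paper.
TRUSTED_SOURCES = {"user"}
SIDE_EFFECT_TOOLS = {"send_message", "run_script", "call_api", "write_file"}


@dataclass
class Action:
    tool: str                 # name of the tool the agent wants to invoke
    args: dict                # arguments the model proposed for that tool
    influenced_by: frozenset  # context sources that shaped this action


def allow(action: Action) -> bool:
    """Decide whether a proposed action may execute.

    Tools without side effects always pass. A side-effecting tool passes
    only if its provenance is known (non-empty) and every influencing
    source is trusted.
    """
    if action.tool not in SIDE_EFFECT_TOOLS:
        return True
    return bool(action.influenced_by) and action.influenced_by <= TRUSTED_SOURCES


# The same authorized tool call, differing only in what steered it.
requested = Action("send_message", {"to": "alice"}, frozenset({"user"}))
injected = Action("send_message", {"to": "alice"}, frozenset({"user", "web_page"}))
print(allow(requested), allow(injected))
```

Both calls look identical at the level of tool and arguments; only the provenance differs. That is the sense in which the harmful step is an ordinary action rather than a forbidden output. A real system would need a more careful policy than this all-or-nothing rule, which would block many benign actions.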