AI RESEARCH
The Misattribution Gap: When Memory Poisoning Looks Like Model Failure in Agentic AI Systems
arXiv CS.AI
•
ArXi:2605.22842v1 Announce Type: cross Multi-agent AI pipelines typically assume that agent misconduct originates from model misalignment. We identify a structural failure in this assumption, the \emph{Misattribution Gap}, where memory-layer attacks produce behaviors indistinguishable from model failure, causing defenders to apply the wrong remediation. We formalize \emph{Semantic Norm Drift} (SND) as a third path to agent misconduct, distinct from emergent misalignment and collusion.