Same Payload, Different Channel: Measuring Trust Asymmetry in Tool-Using Language Models

arXiv:2606.00566v1 Announce Type: new As language models take on agentic roles that span calling external APIs, reading tool outputs, and acting on instructions embedded in third-party content, their attack surface expands well beyond what users type. Whether a model treats a malicious instruction the same way regardless of where it arrives has not been systematically studied. We