AI RESEARCH

Open-source LLMs administer maximum electric shocks in a Milgram-like obedience experiment

arXiv cs.AI

arXiv:2605.21401v1. Announce Type: cross. Large language models (LLMs) are increasingly deployed as autonomous agents that make sequences of decisions over extended interactions in high-stakes domains. However, how LLMs behave under sustained authority pressure remains an open question, with direct implications for the safety of agentic pipelines. We ran a variation of Milgram's obedience experiment on 11 open-source LLMs across 8 conditions, with 30 trials per model per condition, and found that most models reached or approached the final shock level before refusing.