GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Theory

arXiv:2602.12316v2 Announce Type: replace

Frontier AI systems are increasingly capable and deployed in high-stakes multi-agent environments. However, existing AI safety benchmarks largely evaluate single agents, leaving multi-agent risks such as coordination failure and conflict poorly understood. We