AI RESEARCH

Quality-Diversity Evolution for Discovering Diverse Vulnerabilities in LLM Safety

arXiv CS.LG

arXiv:2606.00801v1 Announce Type: cross Current approaches to LLM adversarial testing suffer from coverage gaps: manual red-teaming does not scale, LLM-as-attacker methods exhibit mode collapse, and gradient-based approaches produce uninterpretable gibberish. We