AI RESEARCH
Quality-Diversity Evolution for Discovering Diverse Vulnerabilities in LLM Safety
arXiv cs.LG
arXiv:2606.00801v1 Announce Type: cross Current approaches to LLM adversarial testing suffer from coverage gaps: manual red-teaming does not scale, LLM-as-attacker methods exhibit mode collapse, and gradient-based approaches produce uninterpretable gibberish. We