EvoDefense: Co-Evolving Black-Box Defense with Large Language Models

arXiv:2605.31140v1 Announce Type: cross Large Language Models (LLMs) remain highly vulnerable to diverse attacks, particularly in black-box settings where the internals of target models are inaccessible. Existing black-box defenses typically rely on pre-defined filtering heuristics, which often fail to generalize to unseen attack types and target model architectures. We