LLM Watermark Evasion via Bias Inversion

arXiv:2509.23019v5 Announce Type: replace-cross Watermarking offers a promising solution for detecting LLM-generated content, yet its robustness under realistic query-free (black-box) evasion remains an open challenge. Existing query-free attacks often achieve limited success or severely distort semantic meaning. We bridge this gap by theoretically analyzing rewriting-based evasion, demonstrating that reducing the average conditional probability of sampling green tokens by a small margin causes the detection probability to decay exponentially.
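
The exponential-decay claim can be illustrated with a rough numerical sketch. It is not the paper's theorem: it assumes a green/red-list detector that applies a one-sided z-test to the green-token count, and it uses a standard Hoeffding-style concentration bound. The names `gamma` (green-list fraction), `z` (detection threshold), `p` (an upper bound on each token's conditional probability of being green after rewriting), and both function names are hypothetical choices made for this illustration.

```python
import math


def detection_threshold(T, gamma=0.25, z=4.0):
    """Green-token count at which a one-sided z-test fires on T tokens.

    Assumes the null hypothesis that each token is green with probability gamma.
    """
    return gamma * T + z * math.sqrt(T * gamma * (1.0 - gamma))


def detection_upper_bound(T, p, gamma=0.25, z=4.0):
    """Hoeffding-style upper bound on the detection probability.

    Assumes each of the T tokens is green with conditional probability
    at most p. The bound is exp(-2 * T * gap**2), where gap is the
    distance between the required green rate and p.
    """
    gap = detection_threshold(T, gamma, z) / T - p
    if gap <= 0:
        # The bound is vacuous when p already meets the required rate.
        return 1.0
    return math.exp(-2.0 * T * gap ** 2)


if __name__ == "__main__":
    # Compare a rewritten text at p = gamma with one slightly below it.
    for T in (100, 200, 400, 800):
        at_gamma = detection_upper_bound(T, p=0.25)
        below = detection_upper_bound(T, p=0.20)
        print(f"T={T:4d}  p=0.25: {at_gamma:.3e}  p=0.20: {below:.3e}")
```

Under these assumptions, the bound stays constant in T when p equals gamma, but shrinks exponentially with T once p falls even slightly below gamma, which is the qualitative behavior the abstract describes.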