AI RESEARCH

MaskForge: Structure-Aware Adaptive Attacks for Jailbreaking Diffusion Large Language Models

arXiv CS.AI

arXiv:2606.04027v1 Announce Type: cross

Diffusion large language models (dLLMs) generate text by iteratively denoising partially masked sequences under bidirectional context, which exposes a safety surface distinct from that of autoregressive LLMs. Because mask tokens are native inputs and tokens are committed by confidence rather than by position, harmful content can be induced through infilling and at positions outside the monitored prefix.
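To make the decoding behavior described above concrete, here is a minimal toy sketch of confidence-ordered unmasking. It is not the paper's method and does not call any real model: `MASK`, `confidence_unmask`, `toy_propose`, and the fixed confidence table are all hypothetical names and values chosen for illustration. It shows only that positions get committed in order of confidence, not left to right.

```python
from typing import Callable, Dict, List, Tuple

MASK = "<mask>"  # hypothetical mask symbol; real dLLMs use a tokenizer-specific id

# Hypothetical stand-in for a denoiser: given the current sequence, return a
# (token, confidence) proposal for every position that is still masked.
Proposer = Callable[[List[str]], Dict[int, Tuple[str, float]]]


def confidence_unmask(
    seq: List[str], propose: Proposer, per_step: int = 1
) -> Tuple[List[str], List[int]]:
    """Fill masked positions, committing the most confident proposals first.

    Returns the completed sequence and the order in which positions were committed.
    """
    seq = list(seq)  # work on a copy so the caller's list is untouched
    order: List[int] = []
    while MASK in seq:
        proposals = propose(seq)
        if not proposals:  # nothing proposed: stop instead of looping forever
            break
        # Rank masked positions by confidence, highest first.
        ranked = sorted(proposals.items(), key=lambda kv: kv[1][1], reverse=True)
        for pos, (token, _conf) in ranked[:per_step]:
            seq[pos] = token
            order.append(pos)
    return seq, order


def toy_propose(seq: List[str]) -> Dict[int, Tuple[str, float]]:
    # Fixed, made-up proposals; a real model would recompute these every step
    # from the full bidirectional context.
    table = {1: ("quick", 0.4), 2: ("brown", 0.9), 4: ("jumps", 0.6)}
    return {i: table[i] for i, tok in enumerate(seq) if tok == MASK}


final, order = confidence_unmask(["the", MASK, MASK, "fox", MASK], toy_propose)
print(final)  # completed sequence
print(order)  # commit order of the masked positions
```

In this toy run the middle position is committed first and the leftmost masked position last, which is why a check that inspects only a left-to-right prefix does not see every committed token in generation order.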