DiscoForcing: A Unified Framework for Real-Time Audio-Driven Character Control with Diffusion Forcing

arXiv:2605.28491v1 Announce Type: new

We study real-time audio-responsive character control as a deployment-faithful problem: strictly causal, bounded-latency streaming that must generate coherent full-body motion at interactive frame rates while the audio condition can change abruptly, including tempo shifts, drops, or user edits. Prior music-to-motion systems are largely optimized for offline generation with global context, and degrade in streaming rollouts where conditioning history becomes stale or unreliable. We.