AI RESEARCH

SlotMemory: Object-Centric KV Memory for Streaming Long-Video Generation

arXiv CS.CV

arXiv:2605.31033v1 Announce Type: new

Streaming video generation models typically rely on temporal-centric memory, which organizes historical context as raw frames, chunk segments, or unclustered tokens. This organization frequently leads to identity drift and semantic inconsistency when entities exit the frame or during interactive prompt transitions. To address these limitations, we propose SlotMemory, an object-centric Key-Value memory mechanism for streaming video diffusion.
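The contrast between temporal-centric and object-centric memory can be illustrated with a toy data-structure sketch. This is not the SlotMemory implementation: the abstract does not specify how slots are formed, keyed, or read, so the class names `TemporalMemory` and `ObjectSlotMemory`, the string entity IDs, the string "features", and the least-recently-updated eviction rule are all hypothetical choices made for illustration only.

```python
from collections import OrderedDict, deque


class TemporalMemory:
    """Hypothetical FIFO memory: keeps the last `capacity` frames, whatever they contain."""

    def __init__(self, capacity):
        self.frames = deque(maxlen=capacity)  # oldest frame is dropped automatically

    def write(self, frame_entities):
        # frame_entities maps an entity ID to that entity's feature in this frame.
        self.frames.append(dict(frame_entities))

    def read(self, entity_id):
        # Search the retained window from newest to oldest frame.
        for frame in reversed(self.frames):
            if entity_id in frame:
                return frame[entity_id]
        return None  # the entity has fallen out of the window


class ObjectSlotMemory:
    """Hypothetical object-keyed memory: one slot per entity, holding its latest feature."""

    def __init__(self, max_slots):
        self.max_slots = max_slots
        self.slots = OrderedDict()

    def write(self, frame_entities):
        for entity_id, feature in frame_entities.items():
            self.slots[entity_id] = feature
            self.slots.move_to_end(entity_id)  # mark as most recently updated
        # Assumed eviction rule: drop the least recently updated slot when full.
        while len(self.slots) > self.max_slots:
            self.slots.popitem(last=False)

    def read(self, entity_id):
        return self.slots.get(entity_id)


# A "cat" appears only in frame 0; a "dog" stays on screen for frames 0..5.
stream = [{"cat": "cat@0", "dog": "dog@0"}] + [{"dog": f"dog@{t}"} for t in range(1, 6)]

temporal = TemporalMemory(capacity=3)
slots = ObjectSlotMemory(max_slots=4)
for frame in stream:
    temporal.write(frame)
    slots.write(frame)

print(temporal.read("cat"))  # None: frame 0 has left the 3-frame window
print(slots.read("cat"))     # cat@0: the slot persists after the entity exits
print(slots.read("dog"))     # dog@5: the slot holds the latest observation
```

In this toy setting, the window-based memory forgets the cat once its only frame ages out, whereas the object-keyed memory retains it for as long as slots are available. That is the kind of failure the abstract attributes to temporal-centric organization when entities exit the frame.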