Dreaming Of Others: Latent Teammate Modeling In World Models For Multi-Agent Reinforcement Learning

arXiv:2605.31361v1 Announce Type: cross In cooperative multi-agent reinforcement learning (MARL), agents must coordinate with partners whose internal policies and intentions are not directly observable. While world models such as Dreamer have demonstrated strong generalization and sample efficiency in single-agent settings, their application to MARL remains limited by an inability to handle teammate-induced uncertainty. We propose a new perspective: treat teammates as structured, learnable components within the agent's world model. We.