AI RESEARCH
E$^3$C: Video Generation with 3D Environmental Memory and Ego-Exo Human Pose Control
arXiv CS.AI
arXiv:2605.26316v1 Announce Type: cross Controllable and physically grounded egocentric video generation is essential for embodied agents to reason about how their own and others' actions manifest and change the world. Compared to generic video synthesis, egocentric generation is especially challenging: the camera is tightly coupled to the actor, leading to rapid viewpoint changes and frequent self-occlusions; the underlying actions are subtle, articulated, and often only partially visible; and both the people and the scene state must evolve consistently with the specified controls.