Streaming Reinforcement Learning under Partial Observability with Real-Time Recurrent Learning

arXiv:2605.24709v1

Streaming reinforcement learning has emerged as an online learning paradigm that conforms to the constraints of natural learning agents, which process data incrementally, i.e., with a batch size of 1 and no replay buffer. While streaming RL has recently been shown to scale with deep function approximation under full observability, partially observable settings have remained out of reach.