AI RESEARCH
Generative OOD-regularized Model-based Policy Optimization
arXiv CS.AI
arXiv:2605.24405v1 Announce Type: cross We study sequential decision-making with offline reinforcement learning (RL). Traditional offline RL policies may result in out-of-distribution (OOD) actions when