MedGym: A Unified Continuous-Time Benchmark for Dynamic Medical Treatment Reinforcement Learning

arXiv:2606.01028v1 Announce Type: new Medical treatment recommendation poses several challenges for reinforcement learning (RL): patient physiology evolves in continuous time, measurements and interventions occur at irregular intervals, and treatment effects vary substantially across individuals. Existing RL formulations and simulated environments, however, rely on discrete-time MDP or POMDP abstractions with fixed or pre-specified decision intervals.
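To make the contrast between fixed decision intervals and irregular timing concrete, here is a minimal sketch under stated assumptions. It is not MedGym's API or model: the one-compartment dynamics dx/dt = -k*x + dose_rate, the function names `advance` and `rollout`, and all numeric values are hypothetical and chosen only for illustration. The point is that when the state is advanced by the exact continuous-time solution, the elapsed time `dt` can differ at every step, whereas a fixed-step discrete-time environment bakes one interval into its transition function.

```python
import math

def advance(x, dose_rate, dt, k=0.5):
    """Advance a toy state by dt using the exact solution of
    dx/dt = -k * x + dose_rate (hypothetical dynamics, not from the paper)."""
    decay = math.exp(-k * dt)
    return x * decay + (dose_rate / k) * (1.0 - decay)

def rollout(x0, events, k=0.5):
    """Roll the state forward through (dt, dose_rate) pairs.
    The intervals dt may be irregular; each one is handled exactly."""
    x = x0
    trajectory = [x]
    for dt, dose_rate in events:
        x = advance(x, dose_rate, dt, k)
        trajectory.append(x)
    return trajectory

# Two irregular intervals (0.3 and 1.7 time units) under the same dose rate...
irregular = rollout(1.0, [(0.3, 0.2), (1.7, 0.2)])
# ...end in the same state as one interval of length 2.0.
single = rollout(1.0, [(2.0, 0.2)])
print(irregular[-1], single[-1])
```

In this toy model the final state depends only on the total elapsed time and the dose, not on how the interval is split, which is the property a fixed-step abstraction cannot express without resampling the data onto its own grid.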