Scenario Generation for Risk-Aware Reinforcement Learning with Probably Approximately Safe Guarantees

arXiv:2606.04812v1 Announce Type: cross Guaranteeing safety is critical to the deployment of reinforcement learning (RL) agents in the real world, especially as policies learned using deep RL may demonstrate susceptibility to transition perturbations that result in unknown or unsafe behaviour. One method of policy verification is to construct probabilistic barrier certificates by sampling policy trajectories with respect to safety constraints, thereby demarcating known safe behaviour from unknown behaviour.
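
To make the idea of a sampling-based probabilistic guarantee concrete, the following is a minimal sketch and not the construction used in this work. It swaps in the standard zero-failure bound: if N independent sampled trajectories all satisfy the safety constraint, then with confidence at least 1 - beta the probability of an unsafe trajectory is at most 1 - beta^(1/N). The toy dynamics, the function names `rollout` and `violation_bound`, and all parameter values are assumptions chosen for illustration only.

```python
import random


def violation_bound(num_safe_samples: int, beta: float) -> float:
    """Upper bound on the violation probability after N safe i.i.d. samples.

    If the true violation probability exceeded eps, the chance of seeing
    N safe samples in a row would be below (1 - eps)**N. Setting that
    equal to beta and solving for eps gives eps = 1 - beta**(1/N).
    """
    if num_safe_samples <= 0:
        raise ValueError("need at least one sample")
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return 1.0 - beta ** (1.0 / num_safe_samples)


def rollout(rng: random.Random, horizon: int = 20,
            noise: float = 0.05, limit: float = 1.0) -> bool:
    """Hypothetical closed-loop system: x' = 0.5 * x + w, w ~ U(-noise, noise).

    Returns True if the safety constraint |x| <= limit holds at every step.
    """
    x = rng.uniform(-0.5, 0.5)  # assumed initial-state distribution
    for _ in range(horizon):
        x = 0.5 * x + rng.uniform(-noise, noise)  # perturbed transition
        if abs(x) > limit:
            return False
    return True


rng = random.Random(0)  # fixed seed so the sketch is reproducible
N = 1000                # number of sampled trajectories (assumed)
beta = 1e-3             # confidence parameter (assumed)

all_safe = all(rollout(rng) for _ in range(N))
# The bound below only applies when no sampled trajectory was unsafe.
eps = violation_bound(N, beta) if all_safe else None
print(f"all safe: {all_safe}, violation probability bound: {eps}")
```

The guarantee covers only the trajectory distribution that was sampled; behaviour outside it remains unknown, which is the distinction between known safe and unknown behaviour drawn above.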