Robust Shielding for Safe Reinforcement Learning

arXiv:2606.00270v1

Shielding is an effective approach to formally guarantee the safety of reinforcement learning agents in Markov decision processes (MDPs). However, existing shielding techniques typically assume knowledge of the safety-relevant transition dynamics, a requirement that is seldom met in practice. To address this limitation, we