PREFINE: Preference-Based Implicit Reward and Cost Fine-Tuning for Safety Alignment

arXiv:2605.21225v1 Announce Type: new We address the problem of making a pre-trained reinforcement learning (RL) policy safety-aware by incorporating cost constraints without re