Robots That Know What to Ask: Recovering Misaligned Rewards through Targeted Explanations

ArXi:2605.22986v1 Announce Type: cross Learning reward functions from nstrations assumes that nstrations provide adequate supervision over all features -- or task-relevant aspects of behavior. In practice, nstrations are often imperfect: humans may under-emphasize certain features due to cognitive load or physical difficulty, or the