AI RESEARCH
The Representation-Rationalizability Tradeoff in Reward Learning
arXiv CS.LG
arXiv:2606.00291v1 Announce Type: cross In RLHF, each