AI RESEARCH
SENIOR: Efficient Query Selection and Preference-Guided Exploration in Preference-based Reinforcement Learning
arXiv CS.AI
arXiv:2506.14648v2 Announce Type: replace-cross Preference-based Reinforcement Learning (PbRL) methods avoid reward engineering by learning reward models from human preferences. However, poor feedback efficiency and sample efficiency remain problems that hinder the application of PbRL.