Spectral Souping: A Unified Framework for Online Preference Alignment

arXiv:2605.20408v1 Announce Type: new

Reinforcement Learning from Human Feedback (RLHF) effectively aligns Large Language Models (LLMs) with aggregate human preferences but often fails to address the diverse and conflicting needs of individual users. To overcome this issue, we