AI RESEARCH

Recovering Diversity Without Losing Alignment: A DPO Recipe for Post-Trained LLMs

arXiv CS.CL • June 04, 2026

arXiv:2605.30021v2

Many open-ended instructions have multiple valid answers that users can benefit from seeing, but post-