AI RESEARCH
Parallel Tempering Initial Sampling in Inference-Time Reward Alignment
arXiv CS.LG
arXiv:2605.30991v1 Announce Type: new Inference-time reward alignment steers pretrained diffusion and flow-based generative models to satisfy user-specified rewards without re