TAGRPO: Boosting GRPO on Image-to-Video Generation with Direct Trajectory Alignment

arXiv:2601.05729v2 Announce Type: replace Recent studies have demonstrated the efficacy of integrating Group Relative Policy Optimization (GRPO) into flow matching models, particularly for text-to-image and text-to-video generation. However, we find that directly applying these techniques to image-to-video (I2V) models often fails to yield consistent reward improvements. To address this limitation, we present TAGRPO, a robust post-