LoopFM: Learning frOm HistOrical RePresentations of Foundation Model for Recommendation

arXiv:2605.29280v1 Announce Type: cross Knowledge distillation (KD) transfers a single scalar prediction from a large foundation model (FM) to compact vertical models (VMs). It suffers from a diminishing transfer ratio (the fraction of FM improvement captured by the VM), because a single scalar cannot convey the rich intermediate knowledge that larger FMs learn.
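
To make the transfer ratio concrete, here is a minimal sketch of how it could be computed from three evaluation scores. The abstract defines the ratio only in words, so the exact formula, the function name `transfer_ratio`, its parameter names, and the metric values are all assumptions for illustration, not taken from the paper.

```python
def transfer_ratio(baseline_vm: float, distilled_vm: float, fm: float) -> float:
    """Fraction of the FM's improvement that the distilled VM captures.

    Assumed formula (not stated in the abstract):
        (distilled VM - baseline VM) / (FM - baseline VM)
    All three arguments are scores on the same metric, higher is better.
    """
    fm_gain = fm - baseline_vm
    if fm_gain == 0:
        raise ValueError("FM shows no improvement over the baseline VM; ratio is undefined")
    return (distilled_vm - baseline_vm) / fm_gain


# Hypothetical metric values, chosen only to illustrate the arithmetic.
ratio = transfer_ratio(baseline_vm=0.70, distilled_vm=0.72, fm=0.80)
print(f"transfer ratio: {ratio:.2f}")
```

Under this reading, a ratio near 1 means the VM recovers almost all of the FM's gain, and a ratio that shrinks as the FM grows is the diminishing-transfer effect the abstract describes.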