AI RESEARCH
Rethinking Bregman Divergences in Kronecker-Factored Optimizers
arXiv CS.LG
arXiv:2606.00542v1 Announce Type: new
Shampoo-style optimizers approximate gradient covariance matrices using Kronecker-factored structures. Recent work \cite{lin2026understanding} showed that such approximations can be viewed as projections under Bregman matrix divergences, with different divergences leading to different Kronecker-factored preconditioners. However, it remains unclear what role the choice of divergence plays when the covariance is not exactly Kronecker-factored. We study this question through the spectrum of the covariance matrix.
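To make the setting concrete, the following is a minimal NumPy sketch of one Kronecker-factored approximation of a gradient covariance, and of how its error behaves when the covariance is or is not exactly Kronecker-factored. It is an illustration under stated assumptions, not the method of the paper: the function names are hypothetical, the covariance is taken over a row-major vectorization of a p x q gradient matrix G, the factors are the Shampoo-style statistics L = E[G G^T] and R = E[G^T G], and the normalization by the trace is a choice made here so that the approximation is exact in the Kronecker case.

```python
import numpy as np


def partial_trace_factors(C, p, q):
    """Return (L, R), the p x p and q x q partial traces of C.

    Assumes C is the (p*q) x (p*q) second moment of a row-major vec(G),
    i.e. C[i*q + j, k*q + l] = E[G[i, j] * G[k, l]].
    Under that convention L = E[G G^T] and R = E[G^T G].
    """
    T = C.reshape(p, q, p, q)
    L = np.einsum("ijkj->ik", T)  # sum over the shared column index j
    R = np.einsum("ijil->jl", T)  # sum over the shared row index i
    return L, R


def kron_approx(C, p, q):
    """Kronecker-factored approximation L (x) R / tr(C).

    The trace normalization is an assumption of this sketch; it makes the
    approximation exact whenever C is itself a Kronecker product.
    """
    L, R = partial_trace_factors(C, p, q)
    return np.kron(L, R) / np.trace(C)


def relative_error(C, p, q):
    """Relative Frobenius error of the Kronecker-factored approximation."""
    return np.linalg.norm(C - kron_approx(C, p, q)) / np.linalg.norm(C)
```

Calling `relative_error` on `np.kron(X, Y)` for positive definite `X` and `Y` gives an error at the level of floating-point round-off, while adding a generic positive semidefinite perturbation gives a clearly nonzero error. That second regime is the one in which the choice of divergence can matter.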