AI RESEARCH

Dynamics of Stochastic Momentum with Sparse Updates in High Dimensions

arXiv CS.LG

arXiv:2605.28961v1 Announce Type: cross Existing theory of momentum assumes that gradients arrive at every parameter at a roughly constant rate, an assumption violated in practice by heavy-tailed data distributions and modern architectures. We theoretically analyze the dynamics of two tractable models of momentum under sparse updates: a least squares model with sparse inputs and a logistic regression model with a rare class.
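
To make the "sparse inputs" setting concrete, the following is a minimal sketch of heavy-ball momentum SGD on a noiseless least squares problem in which each input coordinate is nonzero only with small probability. It is not the paper's exact model: the Bernoulli-Gaussian input distribution, the heavy-ball form of the update, the function name `sparse_momentum_least_squares`, and all hyperparameter values are assumptions chosen for illustration.

```python
import math
import random


def sparse_momentum_least_squares(dim=20, steps=2000, p_nonzero=0.2,
                                  lr=0.01, beta=0.9, seed=0):
    """Heavy-ball SGD on noiseless least squares with sparse inputs.

    Hypothetical illustration, not the paper's model.
    Returns (initial_error, final_error): Euclidean distance to w_star.
    """
    rng = random.Random(seed)
    w_star = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # target weights
    w = [0.0] * dim
    v = [0.0] * dim  # momentum buffer, one entry per parameter

    def error():
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(w, w_star)))

    initial_error = error()
    for _ in range(steps):
        # Assumption: each coordinate is active with probability p_nonzero,
        # so most parameters receive no gradient on a given step.
        x = [rng.gauss(0.0, 1.0) if rng.random() < p_nonzero else 0.0
             for _ in range(dim)]
        # Residual of the prediction against the noiseless target y = x . w_star.
        residual = sum(xi * (wi - si) for xi, wi, si in zip(x, w, w_star))
        for i in range(dim):
            grad_i = residual * x[i]  # exactly zero when x[i] == 0
            # The buffer still decays and moves w[i] on steps with no gradient.
            v[i] = beta * v[i] + grad_i
            w[i] -= lr * v[i]
    return initial_error, error()


if __name__ == "__main__":
    start, end = sparse_momentum_least_squares()
    print(f"distance to w_star: {start:.3f} -> {end:.3e}")
```

Lowering `p_nonzero` makes gradient arrivals at each parameter rarer, which is the regime the abstract says existing momentum theory does not cover; the sketch only lets one observe that regime empirically and makes no claim about the paper's results.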