DeMuon: A Decentralized Muon for Matrix Optimization over Graphs

arXiv cs.LG

arXiv:2510.01377v2. In this paper, we propose DeMuon, a method for decentralized matrix optimization over a given communication topology. DeMuon incorporates matrix orthogonalization via Newton-Schulz iterations, a technique inherited from its centralized predecessor Muon, and employs gradient tracking to mitigate heterogeneity among local functions. Under heavy-tailed noise conditions and additional mild assumptions, we establish the iteration complexity of DeMuon for reaching an approximate stochastic stationary point.
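
To make the two ingredients named in the abstract concrete, the following is a minimal NumPy sketch, not the paper's algorithm. It makes several assumptions: it uses the classical cubic Newton-Schulz iteration rather than whatever polynomial DeMuon actually uses, it combines orthogonalization with a textbook gradient-tracking update in one plausible way, and it uses deterministic quadratic local losses instead of heavy-tailed stochastic gradients. The names `newton_schulz`, `tracking_step`, `W`, `A`, and `lr` are hypothetical.

```python
import numpy as np

def newton_schulz(G, steps=30, eps=1e-12):
    """Approximate the orthogonal polar factor of G (classical cubic iteration)."""
    # Frobenius normalization keeps all singular values in (0, 1],
    # which lies inside the iteration's convergence region.
    X = G / (np.linalg.norm(G) + eps)
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ (X.T @ X)
    return X

def tracking_step(X, Y, G_old, W, grad_fn, lr):
    """One decentralized step; X, Y, G_old have shape (n_agents, m, n)."""
    # Assumed update: mix iterates with neighbors, then move along the
    # orthogonalized tracker of each agent.
    D = np.stack([newton_schulz(y) for y in Y])
    X_new = np.einsum("ij,jmn->imn", W, X) - lr * D
    G_new = np.stack([grad_fn(i, X_new[i]) for i in range(len(X_new))])
    # Standard gradient tracking: mix trackers, add the local gradient change.
    Y_new = np.einsum("ij,jmn->imn", W, Y) + G_new - G_old
    return X_new, Y_new, G_new

rng = np.random.default_rng(0)
n_agents, m, n = 3, 4, 3
A = rng.standard_normal((n_agents, m, n))   # heterogeneous local targets
grad_fn = lambda i, x: x - A[i]             # gradient of 0.5 * ||x - A_i||_F^2

# Doubly stochastic mixing matrix for a fully connected 3-agent graph.
W = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])

X = np.zeros((n_agents, m, n))
G = np.stack([grad_fn(i, X[i]) for i in range(n_agents)])
Y = G.copy()                                # trackers start at the local gradients
for _ in range(50):
    X, Y, G = tracking_step(X, Y, G, W, grad_fn, lr=0.05)
```

Because `W` is doubly stochastic and the trackers are initialized at the local gradients, the average of the trackers `Y` stays equal to the average of the current local gradients `G` at every step. This is the property that lets each agent follow an estimate of the global gradient despite heterogeneous local functions.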