AI RESEARCH

Phase transitions for the noisy transformer model in arbitrary dimension

arXiv stat.ML

arXiv:2606.05140v1 Announce Type: cross We study the McKean--Vlasov free energy on the unit sphere associated with the unnormalized self-attention (USA) model for noisy transformer dynamics. We prove a sharp global-minimizer dichotomy in every dimension $d\ge2$. There is a unique $\beta_*^{(d)}>0$ such that \begin{equation*} \frac{I_{d/2+1}(\beta_*^{(d)})}{I_{d/2}(\beta_*^{(d)})}=\frac1d, \end{equation*} where $I_\nu$ is the modified Bessel function of the first kind.
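
To make the threshold concrete, here is a minimal numerical sketch in Python. It takes the defining condition to be $I_{d/2+1}(\beta)/I_{d/2}(\beta)=1/d$, which is an assumption about how the abstract's displayed equation reads, and solves it by bisection. The function names (`bessel_ratio`, `critical_beta`) are hypothetical and not from the paper. The Bessel ratio is evaluated from the standard power series for $I_\nu$, so no external libraries are needed.

```python
def _scaled_series(nu, x, tol=1e-16, max_terms=10_000):
    """Return sum_{k>=0} (x^2/4)^k / (k! * (nu+1)(nu+2)...(nu+k)).

    This equals Gamma(nu+1) * (x/2)^(-nu) * I_nu(x), the power series of
    the modified Bessel function with its leading factor divided out.
    """
    q = 0.25 * x * x
    term, total = 1.0, 1.0
    for k in range(1, max_terms):
        term *= q / (k * (nu + k))
        total += term
        if term < tol * total:
            break
    return total


def bessel_ratio(nu, x):
    """Ratio I_{nu+1}(x) / I_nu(x) for x >= 0, computed from the series."""
    return (x / (2.0 * (nu + 1.0))) * _scaled_series(nu + 1.0, x) / _scaled_series(nu, x)


def critical_beta(d, tol=1e-12):
    """Solve I_{d/2+1}(beta) / I_{d/2}(beta) = 1/d for beta > 0 by bisection.

    Assumption: the right-hand side 1/d is our reading of the abstract's
    equation. The ratio increases from 0 toward 1 in beta, so the root
    is unique and a sign-change bracket always exists.
    """
    nu = d / 2.0
    target = 1.0 / d
    lo, hi = 0.0, 1.0
    while bessel_ratio(nu, hi) < target:  # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if bessel_ratio(nu, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for d in range(2, 7):
        print(d, critical_beta(d))
```

Bisection is used because the ratio $I_{\nu+1}/I_\nu$ is monotone increasing in its argument, so a bracketing method is guaranteed to converge to the unique root without derivative information.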