AI RESEARCH

A Boundary-Layer Mechanism for One-Third Scaling in Online Softmax Classification

arXiv cs.LG

arXiv:2605.22341v1 Announce Type: new

Hard-label classification is usually trained with smooth surrogate losses, most prominently softmax cross-entropy. We isolate an asymptotic mechanism by which this mismatch between the smooth surrogate and the discrete labels produces power-law learning curves in an online teacher-student model. After subtracting the mean logit, the thermodynamic-limit dynamics close in centered variables: a growing centered student-teacher alignment $D$ and the residual student variance $\Delta$.
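The mean-logit subtraction mentioned above rests on a standard property of softmax: adding the same constant to every logit leaves the class probabilities, and hence the cross-entropy loss, unchanged. The following is a minimal sketch of that property only, not the paper's teacher-student model; the helper names `softmax`, `center`, and `cross_entropy` and the example logit values are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)  # shift by the max to avoid overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def center(logits):
    """Subtract the mean logit so the centered logits sum to zero."""
    mean = sum(logits) / len(logits)
    return [z - mean for z in logits]


def cross_entropy(logits, label):
    """Softmax cross-entropy against a hard (integer) label."""
    return -math.log(softmax(logits)[label])


# Hypothetical logits for a 3-class problem (illustrative values only).
logits = [2.0, -1.0, 0.5]
centered = center(logits)

# Probabilities and loss are the same before and after centering,
# up to floating-point rounding.
probs_raw = softmax(logits)
probs_centered = softmax(centered)
loss_raw = cross_entropy(logits, 0)
loss_centered = cross_entropy(centered, 0)
```

Because the loss cannot depend on the mean logit, that direction carries no learning signal, which is consistent with describing the dynamics in centered variables.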