AI RESEARCH

When Both Layers Learn: Training Dynamics of Representing Linear Models via ReLU Networks

arXiv CS.LG

arXiv:2606.04476v1 Announce Type: new In this paper, we study the gradient descent dynamics for jointly