Path-conditioned training: a principled way to rescale ReLU neural networks

arXiv:2602.19799v2 Announce Type: replace-cross Despite recent algorithmic advances, we still lack principled ways to leverage the well-documented rescaling symmetries in ReLU neural network parameters. While two properly rescaled weights implement the same function, the
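
The rescaling symmetry mentioned in the abstract can be illustrated with a minimal sketch. It assumes a one-hidden-layer ReLU network and uses the standard identity relu(λz) = λ·relu(z) for λ > 0. The helper names (`forward`, `rescale_neuron`) and the toy dimensions are hypothetical choices for illustration only. This is not the paper's path-conditioned training method, which the truncated abstract does not describe.

```python
import random

def relu(z):
    return z if z > 0.0 else 0.0

def forward(W1, b1, W2, x):
    """One-hidden-layer ReLU network: y = W2 . relu(W1 . x + b1)."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]

def rescale_neuron(W1, b1, W2, j, lam):
    """Scale hidden neuron j's incoming weights and bias by lam > 0
    and its outgoing weights by 1 / lam (hypothetical helper)."""
    if lam <= 0:
        raise ValueError("lam must be positive for the ReLU identity to hold")
    W1r = [[w * lam for w in row] if i == j else list(row)
           for i, row in enumerate(W1)]
    b1r = [b * lam if i == j else b for i, b in enumerate(b1)]
    W2r = [[w / lam if k == j else w for k, w in enumerate(row)]
           for row in W2]
    return W1r, b1r, W2r

# Toy parameters: 3 inputs, 4 hidden units, 2 outputs (arbitrary sizes).
rng = random.Random(0)
W1 = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(4)]
b1 = [rng.gauss(0, 1) for _ in range(4)]
W2 = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(2)]
x = [rng.gauss(0, 1) for _ in range(3)]

W1r, b1r, W2r = rescale_neuron(W1, b1, W2, j=1, lam=10.0)

y = forward(W1, b1, W2, x)
y_rescaled = forward(W1r, b1r, W2r, x)

# The parameters differ, but the outputs agree up to floating-point error.
max_diff = max(abs(a - b) for a, b in zip(y, y_rescaled))
print("max output difference:", max_diff)
```

The two parameter sets differ by a factor of 10 on one neuron, yet they define the same input-output map. This is the sense in which rescaled weights "implement the same function".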