Performance and Complexity Trade-off Optimization of Speech Models During Training

arXiv:2601.13704v3 Announce Type: replace-cross

In speech machine learning, neural network models are typically designed by choosing an architecture with fixed layer sizes and structure. These models are then trained to maximize performance on metrics aligned with the task's objective. While the overall architecture is usually guided by prior knowledge of the task, the sizes of individual layers are often chosen heuristically.