AI RESEARCH
Learning Fine-grained Parameter Sharing via Sparse Tensor Decomposition
arXiv cs.LG
arXiv:2411.09816v4 Announce Type: replace
Large neural networks achieve state-of-the-art performance on many tasks, yet their sheer size hinders deployment on resource-constrained devices. Among existing compression approaches, cross-layer parameter sharing remains relatively unexplored for transformer models. In this paper, we