On the Relationship Between Activation Outliers and Feature Death in Sparse Autoencoders

arXiv:2605.31518v1 Announce Type: new Sparse autoencoders (SAEs) decompose neural network activations into interpretable features, but many learned features never activate, a problem called feature death that wastes dictionary capacity and can re