Representational Capacity: Geometric Limits on Feature Representation in Transformer Language Models

arXiv:2606.02765v1 Announce Type: new

Model dimension ($d_{model}$) is a fundamental hyperparameter in transformer language models, yet its role in setting the geometric limits of feature representation remains under-explored. Grounded in the Linear Representation and Superposition Hypotheses, which propose that models encode features as near-orthogonal directions in latent space, we develop a framework for estimating how many such directions a model can.
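
To make the notion of near-orthogonal directions concrete, the following is a minimal illustrative sketch, not the framework developed in the paper. It assumes features are modeled as random unit vectors (an assumption of this example only) and measures the largest pairwise interference when more directions than dimensions are packed into a space of size `d_model`. The function name `max_abs_cosine` and the chosen sizes are hypothetical.

```python
import numpy as np


def max_abs_cosine(n_features: int, d_model: int, seed: int = 0) -> float:
    """Largest |cosine similarity| between any two of n_features random
    unit vectors in d_model dimensions (illustrative assumption: features
    are random directions, which is not claimed by the source)."""
    rng = np.random.default_rng(seed)
    # Sample Gaussian vectors and normalize each row to unit length.
    vecs = rng.standard_normal((n_features, d_model))
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    # Gram matrix of unit vectors holds all pairwise cosine similarities.
    gram = vecs @ vecs.T
    # Ignore each vector's similarity with itself.
    np.fill_diagonal(gram, 0.0)
    return float(np.abs(gram).max())


if __name__ == "__main__":
    # Pack twice as many directions as dimensions and report the worst overlap.
    for d in (32, 128, 512):
        print(d, max_abs_cosine(2 * d, d))
```

In this toy setting the worst-case overlap tends to shrink as `d_model` grows, even though the number of directions always exceeds the dimension, which is the intuition behind treating near-orthogonality as a capacity budget.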