Lost in Tokenization: Fundamental Trade-offs in Graph Tokenization for Transformers

arXiv:2605.22471v1 Announce Type: new Transformers have become a central architecture for graph learning, but their application to graphs requires first choosing a tokenization: a graph-to-token map that determines which structural information is exposed at the input. In this work, we show that this choice is a fundamental component of transformer expressivity. We examine three tokenizations that serve as building blocks for many existing graph tokenizations: spectral, random-walk, and adjacency tokenizations.
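
To make the notion of a graph-to-token map concrete, here is a minimal pure-Python sketch of two of the three tokenizations named above. The definitions are assumptions based on common practice, not necessarily the ones used in the paper: the adjacency token of a node is taken to be its row of the adjacency matrix, and the random-walk token is taken to be its vector of k-step return probabilities. The function names `adjacency_tokens` and `random_walk_tokens` are hypothetical. A spectral tokenization would typically use Laplacian eigenvectors and is omitted here because it needs an eigensolver.

```python
def adjacency_tokens(n, edges):
    """One token per node: its row of the undirected adjacency matrix (assumed definition)."""
    A = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1.0
        A[v][u] = 1.0
    return A


def matmul(X, Y):
    """Plain dense matrix product for lists of lists."""
    inner, cols = len(Y), len(Y[0])
    return [[sum(row[k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for row in X]


def random_walk_tokens(n, edges, steps):
    """One token per node: return probabilities (P^k)[i][i] for k = 1..steps (assumed definition)."""
    A = adjacency_tokens(n, edges)
    # Row-normalise to get the transition matrix P = D^{-1} A.
    # Isolated nodes keep an all-zero row.
    P = []
    for row in A:
        deg = sum(row)
        P.append([a / deg if deg else 0.0 for a in row])
    tokens = [[] for _ in range(n)]
    Pk = P
    for _ in range(steps):
        for i in range(n):
            tokens[i].append(Pk[i][i])
        Pk = matmul(Pk, P)
    return tokens


# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
adj = adjacency_tokens(4, edges)
rw = random_walk_tokens(4, edges, steps=3)
```

On this toy graph the two maps expose different information: the adjacency token of a node lists its neighbours explicitly, while the random-walk token only summarises how likely a walk is to return to the node after 1, 2, and 3 steps.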