AdaMerge: Salience-Aware Adaptive Token Merging for Training-Free Acceleration of Vision Transformers

arXiv:2605.27465v1 Announce Type: cross The quadratic cost of self-attention in Vision Transformers (ViTs) constitutes a fundamental bottleneck for practical deployment, motivating a vibrant line of research on token reduction. Among existing approaches, token merging (ToMe) has emerged as an elegant