AI RESEARCH
Is Your LLM Overcharging You? Tokenization, Transparency, and Incentives
arXiv CS.AI
arXiv:2505.21627v4 Announce Type: replace-cross

State-of-the-art large language models require specialized hardware and substantial energy to operate. As a consequence, cloud-based services that provide access to large language models have become very popular. In these services, the price users pay for an output provided by a model depends on the number of tokens the model uses to generate it: users pay a fixed price per token.
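The per-token pricing rule described above amounts to one multiplication. The following is a minimal sketch of it, not any provider's actual billing code: the function name `output_cost` and the price `PRICE_PER_TOKEN` are hypothetical, chosen only for illustration.

```python
from decimal import Decimal


def output_cost(num_tokens: int, price_per_token: Decimal) -> Decimal:
    """Charge for one output under a fixed price per token."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    # The charge depends only on the reported token count,
    # not on the number of characters in the output.
    return num_tokens * price_per_token


# Hypothetical price, for illustration only (not a real provider's rate).
PRICE_PER_TOKEN = Decimal("0.00001")

cost_120 = output_cost(120, PRICE_PER_TOKEN)  # output reported as 120 tokens
cost_180 = output_cost(180, PRICE_PER_TOKEN)  # output reported as 180 tokens
print(cost_120, cost_180)
```

Because the charge scales linearly with the token count, an output reported as 180 tokens costs 1.5 times as much as one reported as 120 tokens, whatever text the tokens encode. `Decimal` is used here instead of `float` to avoid rounding drift in small currency amounts.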