SeqRoute: Global Budget-Aware Sequential LLM Routing via Offline Reinforcement Learning

arXiv:2605.25424v1 Announce Type: cross Existing LLM routing frameworks treat queries as independent events, neglecting the sequential nature of real-world user sessions constrained by global computational budgets. This mismatch inevitably leads to budget bankruptcy: myopic routing policies exhaust resources on early interactions, forcing subsequent and often complex queries onto inadequate models. We