AI RESEARCH

Learning Query-Aware Budget-Tier Routing for Runtime Agent Memory

arXiv cs.LG

arXiv:2602.06025v2 Announce Type: replace-cross

Memory is increasingly central to Large Language Model (LLM) agents operating beyond a single context window, yet most existing systems rely on offline, query-agnostic memory construction that can be inefficient and may discard query-critical information. Although runtime memory utilization is a natural alternative, prior work often incurs substantial overhead and offers limited explicit control over the performance-cost trade-off.