AI RESEARCH
LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents
arXiv cs.LG
arXiv:2602.01053v2. Role specialization in multi-LLM agent systems is often realized via multi-LoRA, where agents share a pretrained backbone and differ only by lightweight adapters. Despite sharing base model weights, each agent independently builds and stores its own KV cache for the same long, tool-augmented trajectories, incurring substantial memory and compute overhead. Existing KV cache sharing methods largely overlook this multi-LoRA setting.
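
To give a sense of the memory overhead described above, the sketch below estimates the KV cache footprint when every agent keeps its own cache for the same trajectory. It uses the standard per-sequence size estimate (keys plus values, for every layer and KV head). The model dimensions, sequence length, and agent count are hypothetical values chosen for illustration; they are not taken from the paper, and the sketch says nothing about how much of this redundancy LRAgent actually removes.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Estimate the KV cache size in bytes for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes fp16/bf16 storage (an assumption).
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


# Hypothetical configuration, for illustration only.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 32, 8, 128
SEQ_LEN = 32_768   # one long, tool-augmented trajectory
N_AGENTS = 4       # agents sharing the backbone, each with its own LoRA adapter

per_agent = kv_cache_bytes(SEQ_LEN, N_LAYERS, N_KV_HEADS, HEAD_DIM)
independent_total = N_AGENTS * per_agent  # every agent builds its own cache

GIB = 2**30
print(f"per-agent cache:      {per_agent / GIB:.1f} GiB")
print(f"{N_AGENTS} independent caches: {independent_total / GIB:.1f} GiB")
```

Under these assumed numbers, one cache takes 4 GiB and four independent caches take 16 GiB, so the cost of duplication grows linearly with the number of agents even though the backbone weights are stored only once.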