AI RESEARCH

LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents

arXiv CS.LG

arXiv:2602.01053v2

Role specialization in multi-LLM agent systems is often realized via multi-LoRA, where agents share a pretrained backbone and differ only by lightweight adapters. Despite sharing base model weights, each agent independently builds and stores its own KV cache for the same long, tool-augmented trajectories, incurring substantial memory and compute overhead. Existing KV cache sharing methods largely overlook this multi-LoRA setting.
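
To give a rough sense of the memory overhead described above, the following is a back-of-the-envelope sketch, not the paper's method. It estimates the KV cache size for one agent and for several agents that each keep a separate cache over the same trajectory. The model dimensions (32 layers, 8 KV heads, head dimension 128, 32,768 tokens, 16-bit elements) and the function names are illustrative assumptions, not values taken from the paper.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Approximate KV cache size for one sequence on one agent.

    Each layer stores two tensors (keys and values), each of shape
    [seq_len, num_kv_heads, head_dim].
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem


def unshared_kv_bytes(num_agents, **model):
    """Total size when every agent keeps its own cache for the same trajectory."""
    return num_agents * kv_cache_bytes(**model)


# Hypothetical configuration, chosen only for illustration.
cfg = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=32_768)

per_agent = kv_cache_bytes(**cfg)
total = unshared_kv_bytes(4, **cfg)

gib = 1024 ** 3
print(f"per agent: {per_agent / gib:.1f} GiB")   # per agent: 4.0 GiB
print(f"4 agents:  {total / gib:.1f} GiB")       # 4 agents:  16.0 GiB
```

Under these assumed numbers, the cache grows linearly with the number of agents even though all of them read the same trajectory through the same backbone, which is the redundancy the abstract points to.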