Sliding Windows Forget: Why Long-Running LLM Apps Need Memory Policy
Towards AI • Generative AI • AI Research
Most long-running LLM failures are not pure reasoning failures. They are state-selection failures: the next model call gets incomplete, stale, or irrelevant context.

In short chats, appending recent messages often works. In persistent sessions, that breaks down because durable facts vanish, stale updates return, and routine traffic consumes the prompt budget.

The real question becomes: which pieces of prior state deserve to be in the next model call? I built LLM-Context-Optimization-Engine to explore that question.
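To make the "durable facts vanish" failure concrete, here is a minimal sketch of a naive sliding window. It is an illustration only, not code from LLM-Context-Optimization-Engine: the function names `sliding_window` and `build_history`, the sample messages, and the window size of 4 are all hypothetical.

```python
from collections import deque


def sliding_window(history, max_messages):
    """Keep only the most recent `max_messages` entries (naive policy)."""
    return list(deque(history, maxlen=max_messages))


def build_history():
    """Hypothetical session: one durable fact, then routine traffic."""
    history = ["user: I'm allergic to peanuts."]  # stated once, matters forever
    for i in range(1, 6):
        history.append(f"user: routine message {i}")
    return history


history = build_history()
window = sliding_window(history, max_messages=4)

# The durable fact was the oldest message, so recency-only selection drops it.
fact_survives = any("allergic" in message for message in window)

print(window)
print("durable fact still in context:", fact_survives)
```

With six messages and a window of four, the allergy statement is the first thing evicted, even though it is the most important item in the session. A memory policy would have to rank messages by something other than recency to keep it.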