AI RESEARCH

SemanticZip: A Pilot Framework for Lossy Text Compression with LLMs as Semantic Decompressors

arXiv CS.AI

arXiv:2605.24541v1 Announce Type: cross

Text compression for large language model (LLM) systems is usually framed as token deletion, retrieval, summarization, or exact reconstruction. We study an aggressive but explicitly lossy setting: compressing text into compact codes that an LLM can expand into task-relevant meaning. We call this setting SemanticZip.