Benchmarking and Enhancing VLM for Compressed Image Understanding

arXiv:2512.20901v2 Announce Type: replace With the rapid development of Vision-Language Models (VLMs) and the growing demand for their applications, efficient compression of image inputs has become increasingly important. Existing VLMs predominantly process and understand high-bitrate compressed images, while their ability to interpret low-bitrate compressed images remains unexplored. In this paper, we