The Rescue Effect: Spatio-Semantic Early Exit Bypasses Quantization Collapse in CLIP

arXiv:2605.26415v1 Announce Type: cross

Deploying Vision-Language Models on resource-constrained hardware typically requires INT8 quantization, but in joint-embedding architectures such as CLIP this