OVA-IB: One vs All Information Bottleneck for Multi-Modal Alignment

arXiv:2605.29900v1. Contrastive learning is effective for aligning paired views or modalities, but alignment beyond two modalities remains non-trivial and comparatively underexplored. Pairwise CLIP-style losses decompose multi-modal alignment into independent two-way comparisons and therefore do not explicitly model higher-order dependencies among the modalities.
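
To make the pairwise decomposition concrete, here is a minimal NumPy sketch of the baseline the abstract criticizes, not of OVA-IB itself. It assumes each modality yields a batch of embeddings whose rows are paired across modalities, and it averages a symmetric InfoNCE term over every pair of modalities. The function names (`info_nce`, `pairwise_alignment_loss`) and the temperature value are illustrative assumptions, not taken from the paper.

```python
import itertools
import numpy as np


def _log_softmax(x, axis):
    # Numerically stable log-softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def info_nce(za, zb, temperature=0.07):
    """Symmetric CLIP-style InfoNCE between two batches of paired embeddings.

    Row i of za and row i of zb are assumed to come from the same sample.
    """
    za = za / np.linalg.norm(za, axis=1, keepdims=True)
    zb = zb / np.linalg.norm(zb, axis=1, keepdims=True)
    logits = za @ zb.T / temperature           # (batch, batch) cosine similarities
    idx = np.arange(len(za))                   # positives sit on the diagonal
    loss_ab = -_log_softmax(logits, axis=1)[idx, idx].mean()  # a -> b retrieval
    loss_ba = -_log_softmax(logits, axis=0)[idx, idx].mean()  # b -> a retrieval
    return 0.5 * (loss_ab + loss_ba)


def pairwise_alignment_loss(embeddings, temperature=0.07):
    """Average the two-way loss over all modality pairs (hypothetical helper).

    Each term only ever sees two modalities, so no term depends jointly
    on three or more of them.
    """
    pairs = list(itertools.combinations(range(len(embeddings)), 2))
    total = sum(info_nce(embeddings[i], embeddings[j], temperature) for i, j in pairs)
    return total / len(pairs)


# Usage: three modalities, batch of 8, embedding dimension 16 (random stand-ins).
rng = np.random.default_rng(0)
z = [rng.normal(size=(8, 16)) for _ in range(3)]
print(pairwise_alignment_loss(z))
```

With M modalities this sums M(M-1)/2 independent two-way terms, which is why the objective has no term that couples three or more modalities at once.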