The Multilingual Curse at the Retrieval Layer: Evidence from Amharic

arXiv:2605.24556v1

Multilingual retrieval increasingly underpins cross-lingual question answering and retrieval-augmented generation. Strong zero-shot scores on multilingual benchmarks are often taken as evidence that current encoders transfer reliably across many languages. We argue that this assumption breaks down for underrepresented, morphologically rich languages, and use Amharic as a diagnostic case.