AI RESEARCH

The Latin Substrate: How Language Models Represent and Mediate Script Choice

arXiv CS.CL

arXiv:2605.31363v1 Announce Type: new Many languages are written in multiple scripts, requiring large language models (LLMs) to generate equivalent linguistic content in distinct orthographic forms. While prior work suggests that LLMs route information through shared latent representations, how they internally mediate script variation remains poorly understood.