AI RESEARCH

When Mean CE Fails: Median CE Can Better Track Language Model Quality

arXiv CS.AI • May 26, 2026

arXiv:2605.24667v1 Announce Type: new Mean cross-entropy is the standard validation metric for language models, but it can fail to track model quality during
