MIRA: A Bilingual Benchmark for Medical Information Response Audit

arXiv:2605.28025v1 Announce Type: new Large language models (LLMs) are increasingly used to provide public-facing health information, yet existing safety evaluations overlook whether responses preserve comparable medical information across different user phrasings of the same question. To address this, we