AI RESEARCH
MIRA: A Bilingual Benchmark for Medical Information Response Audit
arXiv cs.AI
arXiv:2605.28025v1 Announce Type: new
Large language models (LLMs) are increasingly used to provide public-facing health information, yet existing safety evaluations overlook whether responses preserve comparable medical information when users phrase the same question differently. To address this, we