AI RESEARCH
Do No Harm? Hallucination and Actor-Level Abuse in Web-Deployed Medical Large Language Models
arXiv cs.CL
arXiv:2605.20591v1 Announce Type: new Medical large language models (LLMs), including custom medical GPTs (MedGPTs) and open-source models, are increasingly deployed on web platforms to provide clinical guidance. However, they pose risks of hallucination, policy noncompliance, and unsafe design. We conduct a large-scale assessment of 6,233 MedGPTs, evaluating a stratified sample of 1,500, together with 10 open-source LLMs. We