AI RESEARCH
Internal Representation, Not Clinical Knowledge: Where Apparent LLM Triage Failures Originate
arXiv CS.AI
arXiv:2605.29889v1 Announce Type: cross Patient-voiced clinical-triage benchmarks report high under-triage rates for consumer LLMs under constrained multiple-choice output, yet the same cases score differently under free-text output. We ask whether output format changes the model's clinical representation or only the mapping from a preserved representation to an answer.