AI RESEARCH

Internal Representation, Not Clinical Knowledge: Where Apparent LLM Triage Failures Originate

arXiv CS.AI

arXiv:2605.29889v1 Announce Type: cross Patient-voiced clinical-triage benchmarks report high under-triage rates for consumer LLMs under constrained multiple-choice output, yet the same cases score differently under free-text output. We ask whether output format changes the model's clinical representation or only the mapping from a preserved representation to an answer.