Same Patient, Different Words, Different Diagnosis? Evaluating Semantic Stability in Clinical LLMs

arXiv:2605.30646v1

Large Language Models (LLMs) are increasingly used in clinical applications. However, their behavior remains highly sensitive to subtle linguistic variation, such as rephrasing or changes in syntax. This sensitivity poses risks in safety-critical healthcare settings, where semantically equivalent inputs should produce consistent predictions.