Functional Entropy: Predicting Functional Correctness in LLM-Generated Code with Uncertainty Quantification

arXiv:2605.28500v1 Announce Type: cross Large language models have shown impressive capabilities in code generation, yet they often produce functionally incorrect code. Uncertainty quantification (UQ) methods have emerged as a promising approach for detecting hallucinations in natural language generation, but their effectiveness for code generation tasks remains underexplored. We systematically evaluate how UQ techniques transfer to code generation across three programming languages, five LLMs, and over 1,700 problems.