Smoothed Elicitation Complexity for Approximate $\Gamma$-calibration of Discrete Classification Tasks

arXiv:2605.23017v1 Announce Type: new

One prominent way to evaluate the trustworthiness of a machine learning model is calibration. In the binary outcome setting, a probabilistic predictor is calibrated if outcomes are realized according to the predictor's distributional prediction, conditioned on that prediction. Straightforward extensions of binary calibration definitions to probabilistic multiclass classifiers suffer from an exponential complexity blowup, since the space of predictions grows exponentially in the number of classes $n$.
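
To make the binary definition concrete, here is a minimal Python sketch of a binned calibration check. It is an illustration only, not the paper's method: the function name `binned_calibration_error`, the equal-width binning, and the default of 10 bins are all assumptions introduced here. Within each bin it compares the average predicted probability to the observed frequency of positive outcomes; a calibrated predictor makes these two quantities agree.

```python
from collections import defaultdict


def binned_calibration_error(predictions, outcomes, num_bins=10):
    """Binned calibration error for a binary predictor (illustrative sketch).

    predictions: probabilities in [0, 1] assigned to the positive outcome.
    outcomes:    realized labels, each 0 or 1.
    num_bins:    number of equal-width bins on [0, 1] (an assumption of this sketch).
    """
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")

    # For each bin, accumulate [sum of predictions, sum of outcomes].
    sums = defaultdict(lambda: [0.0, 0.0])
    for p, y in zip(predictions, outcomes):
        # Clamp so that p == 1.0 falls in the last bin.
        b = min(int(p * num_bins), num_bins - 1)
        sums[b][0] += p
        sums[b][1] += y

    # Weighted average over bins of |mean prediction - outcome frequency|,
    # which simplifies to sum_b |pred_sum_b - outcome_sum_b| / N.
    total = len(predictions)
    return sum(abs(ps - ys) for ps, ys in sums.values()) / total
```

For example, a predictor that always outputs 0.25 on a sample where one outcome in four is positive has zero error under this check, while one that always outputs 0.75 on the same sample has error 0.5. In the multiclass case the analogous check would condition on an entire probability vector over $n$ classes, which is where the blowup described above arises.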