Near-Optimal Regret in Adversarial Kernel Bandits

arXiv:2605.26585v1. We study the adversarial kernel bandit problem, in which the loss at each round is induced by an arbitrary bounded element of a reproducing kernel Hilbert space (RKHS). We propose an exponential-weights algorithm built on a regularized importance-weighted loss estimator, together with an explicit correction term that cancels the bias