Bandit Convex Optimization with Gradient Prediction Adaptivity

arXiv cs.LG

arXiv:2605.22191v1, Announce Type: new

Bandit convex optimization (BCO) is a fundamental online learning framework with partial feedback, where the learner observes only the loss incurred at the chosen decision point in each round. In this work, we investigate whether optimistic gradient predictions can improve worst-case regret guarantees in a prediction-adaptive manner.
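To make the feedback model concrete, the bandit protocol can be sketched as below. This is only an illustration of the setting, built on the classical one-point gradient estimator, and is not the algorithm studied in this work. The unit-ball decision set, the function name `run_bco`, and the parameters `eta` (step size) and `delta` (exploration radius) are all assumptions made for the example.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Sample a direction uniformly from the unit sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def project_to_ball(x, radius):
    """Euclidean projection onto the origin-centered ball of the given radius."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= radius:
        return list(x)
    return [c * radius / norm for c in x]


def run_bco(losses, dim, eta=0.01, delta=0.1, seed=0):
    """Play one round per loss, observing only the scalar loss at the played point.

    Assumption: the decision set is the unit ball, and the learner uses the
    one-point gradient estimate (dim / delta) * loss * u with projected
    gradient descent. All names here are hypothetical.
    """
    rng = random.Random(seed)
    x = [0.0] * dim  # learner's internal iterate, kept in the shrunken ball
    played, observed = [], []
    for f in losses:
        u = random_unit_vector(dim, rng)
        y = [xi + delta * ui for xi, ui in zip(x, u)]  # point actually played
        loss = f(y)  # bandit feedback: one scalar, no gradient information
        g = [(dim / delta) * loss * ui for ui in u]  # one-point gradient estimate
        step = [xi - eta * gi for xi, gi in zip(x, g)]
        x = project_to_ball(step, 1.0 - delta)  # keeps the next played point feasible
        played.append(y)
        observed.append(loss)
    return played, observed
```

Each round the learner sees a single number, `loss`, which is the partial feedback the abstract describes. A prediction-adaptive method would additionally take a guess of the upcoming gradient as input; how that guess is used is not specified here and is left out of the sketch.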