Bandit Convex Optimization with Gradient Prediction Adaptivity
arXiv CS.LG
arXiv:2605.22191v1 (announce type: new)

Bandit convex optimization (BCO) is a fundamental online learning framework with partial feedback: in each round, the learner observes only the loss incurred at its chosen decision point. In this work, we investigate whether optimistic gradient predictions can improve worst-case regret guarantees in a prediction-adaptive manner.
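
To make the partial-feedback setting concrete, here is a minimal sketch of the classical one-point gradient estimator commonly used in BCO. It is not the method of this paper, whose algorithm the abstract does not describe. The function names, the quadratic test loss, and the smoothing radius `delta` are assumptions chosen for illustration only.

```python
import math
import random


def sample_unit_sphere(dim, rng):
    """Draw a direction uniformly from the unit sphere via a normalized Gaussian."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:
            return [v / norm for v in g]


def one_point_gradient_estimate(loss, x, delta, rng):
    """Build a gradient estimate from a single scalar loss value.

    The learner plays the perturbed point x + delta * u and observes only the
    loss there. The vector (dim / delta) * loss * u is an unbiased estimate of
    the gradient of a smoothed version of the loss at x.
    """
    dim = len(x)
    u = sample_unit_sphere(dim, rng)
    played = [xi + delta * ui for xi, ui in zip(x, u)]
    value = loss(played)  # the only feedback available in the bandit setting
    estimate = [(dim / delta) * value * ui for ui in u]
    return played, value, estimate


def quadratic_loss(point):
    # Hypothetical loss for illustration: f(x) = ||x - c||^2 with c = (1, -2).
    center = (1.0, -2.0)
    return sum((p - c) ** 2 for p, c in zip(point, center))


def average_estimate(loss, x, delta, rounds, seed=0):
    """Average many independent one-point estimates taken at the same x."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(rounds):
        _, _, g = one_point_gradient_estimate(loss, x, delta, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / rounds for t in total]
```

For this quadratic loss the true gradient at the origin is (-2, 4), and averaging many one-point estimates approaches it, although each individual estimate is very noisy. That high variance is one reason a gradient prediction could plausibly help: an accurate prediction leaves less for the learner to estimate from scalar feedback.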