Nonstationary Generalized Linear Bandits with Discounted Online Mirror Descent
arXiv cs.LG
We study nonstationary generalized linear bandits (GLBs), where the expected reward is modeled through a nonlinear link function with an unknown time-varying parameter. Existing approaches are predominantly based on maximum-likelihood estimation (MLE), using sliding-window, restart, or discounting mechanisms to handle nonstationarity. Although these methods achieve statistically efficien