Optimal Design for Multinomial Logit Model with Applications to Best Assortment Identification

arXiv:2605.25592v1

We study optimal experimental design for multinomial logit (MNL) bandits, where an agent repeatedly selects a subset of $K$ items from a ground set of size $N$ and observes single-choice feedback. Unlike linear or generalized linear bandits, MNL bandits have a combinatorial action space, which makes classical optimal design approaches and naive optimization over all subsets computationally intractable.
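
To make the setting concrete, the following is a minimal sketch of single-choice MNL feedback and of how fast the action space grows. It assumes the standard MNL parametrization with one latent utility per item and an outside (no-choice) option of weight 1; the abstract does not state these details, and the names `mnl_choice_probs`, `sample_choice`, and `num_assortments` are hypothetical, introduced only for illustration.

```python
import math
import random


def mnl_choice_probs(utilities, assortment):
    """Choice probabilities under an assumed standard MNL model.

    utilities:  dict mapping item -> latent utility (assumed parametrization)
    assortment: iterable of offered items
    Returns a dict over the offered items plus None (the no-choice option).
    """
    weights = {i: math.exp(utilities[i]) for i in assortment}
    # The constant 1.0 is the outside option's weight (an assumption).
    denom = 1.0 + sum(weights.values())
    probs = {i: w / denom for i, w in weights.items()}
    probs[None] = 1.0 / denom
    return probs


def sample_choice(utilities, assortment, rng):
    """Draw one single-choice observation for the offered assortment."""
    probs = mnl_choice_probs(utilities, assortment)
    options = list(probs)
    return rng.choices(options, weights=[probs[o] for o in options], k=1)[0]


def num_assortments(n, k):
    """Number of distinct size-k subsets of n items."""
    return math.comb(n, k)


if __name__ == "__main__":
    rng = random.Random(0)
    utilities = {i: 0.1 * i for i in range(50)}  # illustrative values only
    offered = [3, 7, 11, 19, 42]
    print(mnl_choice_probs(utilities, offered))
    print(sample_choice(utilities, offered, rng))
    print(num_assortments(50, 5))
```

Even for a modest $N = 50$ and $K = 5$ there are over two million candidate assortments, which illustrates why enumerating all subsets quickly becomes impractical.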