AI RESEARCH

ProbMoE: Differentiable Probabilistic Routing for Mixture-of-Experts

arXiv cs.AI

arXiv:2606.01509v1 Announce Type: cross Mixture-of-Experts (MoE) models scale by activating only a small subset of experts per token. However