AI RESEARCH
ProbMoE: Differentiable Probabilistic Routing for Mixture-of-Experts
arXiv cs.AI
arXiv:2606.01509v1 Announce Type: cross Mixture-of-Experts (MoE) models scale by activating only a small subset of experts per token. However