Summary of Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models, by Yongxin Guo et al.
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models by Yongxin Guo, Zhenglin Cheng,…
Mixture of Experts Meets Prompt-Based Continual Learning by Minh Le, An Nguyen, Huy Nguyen, Trang Nguyen,…
Statistical Advantages of Perturbing Cosine Router in Mixture of Experts by Huy Nguyen, Pedram Akbarian, Trang…
DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis by Yu Shee, Haote Li, Anton Morgunov, Victor Batista. First…
Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts by Huy Nguyen,…
Ensemble and Mixture-of-Experts DeepONets For Operator Learning by Ramansh Sharma, Varun Shankar. First submitted to arXiv on:…
Learning More Generalized Experts by Merging Experts in Mixture-of-Experts by Sejik Park. First submitted to arXiv on:…
Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training by Zexuan Zhong, Mengzhou Xia, Danqi Chen,…
Hierarchical mixture of discriminative Generalized Dirichlet classifiers by Elvis Togban, Djemel Ziou. First submitted to arXiv on:…
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts by Jianan Zhou, Zhiguang Cao, Yaoxin Wu, Wen Song,…