Summary of Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts, by Fanqi Yan et al.
Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts by Fanqi Yan, Huy…
MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router by Yanyue Xie, Zhi…
MoH: Multi-Head Attention as Mixture-of-Head Attention by Peng Jin, Bo Zhu, Li Yuan, Shuicheng Yan. First submitted…
AT-MoE: Adaptive Task-planning Mixture of Experts via LoRA Approach by Xurui Li, Juanjuan Yao. First submitted to…
Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts by Xu Liu, Juncheng Liu,…
Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models by Jun Luo, Chen Chen,…
ContextWIN: Whittle Index Based Mixture-of-Experts Neural Model For Restless Bandits Via Deep RL by Zhanqiu Guo,…
MoIN: Mixture of Introvert Experts to Upcycle an LLM by Ajinkya Tejankar, KL Navaneet, Ujjawal Panchal,…
GETS: Ensemble Temperature Scaling for Calibration in Graph Neural Networks by Dingyi Zhuang, Chonghe Jiang, Yunhan…
Retraining-Free Merging of Sparse MoE via Hierarchical Clustering by I-Chun Chen, Hsu-Shen Liu, Wei-Fang Sun, Chen-Hao…