Summary of MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts, by Rachel S.Y. Teo et al.
MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts, by Rachel S.Y. Teo, Tan M. Nguyen. First submitted…
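To make the idea named in the title concrete, the sketch below shows one way heavy-ball momentum can be threaded through a stack of sparse MoE layers: each layer's top-k expert output is treated as an update direction and accumulated into a momentum buffer that is passed on to the next layer. This is a minimal illustration under assumed conventions (a generic top-k router, two-layer expert MLPs, and the update v <- mu*v + SMoE(x), x <- x + step*v); the class and parameter names (TopKSMoE, MomentumSMoEBlock, mu, step) are illustrative and not taken from the paper.

```python
# Illustrative sketch only: a generic top-k SMoE block wrapped with a
# heavy-ball momentum buffer carried across layers. The exact update rule
# and all names here are assumptions for illustration.
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKSMoE(nn.Module):
    """Standard sparse MoE block: route each token to its top-k experts."""

    def __init__(self, d_model: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        logits = self.router(x)                          # (B, S, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)       # top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., slot] == e               # tokens routed to expert e
                if mask.any():
                    w = weights[..., slot][mask].unsqueeze(-1)  # (N, 1) gate weights
                    out[mask] += w * expert(x[mask])
        return out


class MomentumSMoEBlock(nn.Module):
    """Wraps an SMoE block with a heavy-ball momentum buffer.

    Assumed update (one possible sign convention):
        v <- mu * v + SMoE(x)
        x <- x + step * v
    """

    def __init__(self, d_model: int, mu: float = 0.7, step: float = 1.0, **moe_kwargs):
        super().__init__()
        self.moe = TopKSMoE(d_model, **moe_kwargs)
        self.mu, self.step = mu, step

    def forward(self, x: torch.Tensor, v: Optional[torch.Tensor] = None):
        if v is None:
            v = torch.zeros_like(x)
        v = self.mu * v + self.moe(x)
        return x + self.step * v, v                      # momentum state goes to next layer


if __name__ == "__main__":
    blocks = nn.ModuleList(MomentumSMoEBlock(64) for _ in range(4))
    x, v = torch.randn(2, 10, 64), None
    for block in blocks:
        x, v = block(x, v)                               # carry momentum across layers
    print(x.shape)                                       # torch.Size([2, 10, 64])
```

Setting mu = 0 recovers the plain residual SMoE update, which is one way to sanity-check such a wrapper against a baseline SMoE stack.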
ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mobility Prediction, by Haoyu He, Haozheng Luo, Qi…
Enhancing Generalization in Sparse Mixture of Experts Models: The Case for Increased Expert Activation in…
Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts, by Fanqi Yan, Huy…
MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router, by Yanyue Xie, Zhi…
MoH: Multi-Head Attention as Mixture-of-Head Attention, by Peng Jin, Bo Zhu, Li Yuan, Shuicheng Yan. First submitted…
AT-MoE: Adaptive Task-planning Mixture of Experts via LoRA Approach, by Xurui Li, Juanjuan Yao. First submitted to…
Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts, by Xu Liu, Juncheng Liu,…
Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models, by Jun Luo, Chen Chen,…
ContextWIN: Whittle Index Based Mixture-of-Experts Neural Model For Restless Bandits Via Deep RL, by Zhanqiu Guo,…