Summary of MoIN: Mixture of Introvert Experts to Upcycle an LLM, by Ajinkya Tejankar et al.
MoIN: Mixture of Introvert Experts to Upcycle an LLM, by Ajinkya Tejankar, KL Navaneet, Ujjawal Panchal,…
GETS: Ensemble Temperature Scaling for Calibration in Graph Neural Networks, by Dingyi Zhuang, Chonghe Jiang, Yunhan…
Retraining-Free Merging of Sparse MoE via Hierarchical Clustering, by I-Chun Chen, Hsu-Shen Liu, Wei-Fang Sun, Chen-Hao…
More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing, by Sagi Shaier, Francisco Pereira, Katharina…
Upcycling Large Language Models into Mixture of Experts, by Ethan He, Abhinav Khattar, Ryan Prenger, Vijay…
Toward generalizable learning of all (linear) first-order methods via memory augmented Transformers, by Sanchayan Dutta, Suvrit…
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts, by Peng Jin, Bo Zhu, Li Yuan, Shuicheng Yan. First…
Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs, by Ruijia Niu, Dongxia Wu, Rose Yu, Yi-An…
Mixture Compressor for Mixture-of-Experts LLMs Gains More, by Wei Huang, Yue Liao, Jianhui Liu, Ruifei He,…
Scaling Laws Across Model Architectures: A Comparative Analysis of Dense and MoE Models in Large…