Summary of MoIN: Mixture of Introvert Experts to Upcycle an LLM, by Ajinkya Tejankar et al.
MoIN: Mixture of Introvert Experts to Upcycle an LLM, by Ajinkya Tejankar, KL Navaneet, Ujjawal Panchal,…
GETS: Ensemble Temperature Scaling for Calibration in Graph Neural Networks, by Dingyi Zhuang, Chonghe Jiang, Yunhan…
Retraining-Free Merging of Sparse MoE via Hierarchical Clustering, by I-Chun Chen, Hsu-Shen Liu, Wei-Fang Sun, Chen-Hao…
More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing, by Sagi Shaier, Francisco Pereira, Katharina…
Upcycling Large Language Models into Mixture of Experts, by Ethan He, Abhinav Khattar, Ryan Prenger, Vijay…
Toward generalizable learning of all (linear) first-order methods via memory augmented Transformers, by Sanchayan Dutta, Suvrit…
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts, by Peng Jin, Bo Zhu, Li Yuan, Shuicheng Yan. First…
Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs, by Ruijia Niu, Dongxia Wu, Rose Yu, Yi-An…
Mixture Compressor for Mixture-of-Experts LLMs Gains More, by Wei Huang, Yue Liao, Jianhui Liu, Ruifei He,…
Scaling Laws Across Model Architectures: A Comparative Analysis of Dense and MoE Models in Large…