Summary of Towards an Empirical Understanding of MoE Design Choices, by Dongyang Fan et al.
Towards an empirical understanding of MoE design choices, by Dongyang Fan, Bettina Messmer, and Martin Jaggi. First submitted…