Summary of MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization, by Aozhong Zhang et al.
MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization, by Aozhong Zhang, Naigang Wang, Yanxia Deng, Xin…
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models, by Zachary Ankner, Cody Blakeney, Kartik…
On the Noise Robustness of In-Context Learning for Text Generation, by Hongfu Gao, Feipeng Zhang, Wenyu…
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models, by Wei Huang, Haotong Qin, Yangdong Liu, Yawei…
HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models, by Rhea Sanjay Sukthanker, Arber Zela, Benedikt Staffler, Aaron…
Improving Transformers with Dynamically Composable Multi-Head Attention, by Da Xiao, Qingye Meng, Shengping Li, Xingyuan Yuan. First…
State-Free Inference of State-Space Models: The Transfer Function Approach, by Rom N. Parnichkun, Stefano Massaroli, Alessandro…
Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training, by Zexuan Zhong, Mengzhou Xia, Danqi Chen,…
M-DEW: Extending Dynamic Ensemble Weighting to Handle Missing Values, by Adam Catto, Nan Jia, Ansaf Salleb-Aouissi,…
Benchmarking Benchmark Leakage in Large Language Models, by Ruijie Xu, Zengzhi Wang, Run-Ze Fan, Pengfei Liu. First…