Summary of Quamba: A Post-Training Quantization Recipe for Selective State Space Models, by Hung-Yueh Chiang et al.
Quamba: A Post-Training Quantization Recipe for Selective State Space Models, by Hung-Yueh Chiang, Chi-Chih Chang, Natalia…
AsymKV: Enabling 1-Bit Quantization of KV Cache with Layer-Wise Asymmetric Quantization Configurations, by Qian Tao, Wenyuan…
DAQ: Density-Aware Post-Training Weight-Only Quantization for LLMs, by Yingsong Luo, Ling Chen. First submitted to arXiv on: …
Scaling Laws for Post-Training Quantized Large Language Models, by Zifei Xu, Alexander Lan, Wanzin Yazar,…
Efficiera Residual Networks: Hardware-Friendly Fully Binary Weight with 2-bit Activation Model Achieves Practical ImageNet Accuracy, by…
QSpec: Speculative Decoding with Complementary Quantization Schemes, by Juntao Zhao, Wenhao Lu, Sheng Wang, Lingpeng Kong,…
Error Diffusion: Post-Training Quantization with Block-Scaled Number Formats for Neural Networks, by Alireza Khodamoradi, Kristof…
Continuous Approximations for Improving Quantization Aware Training of LLMs, by He Li, Jianhang Hong, Yuanzhuo Wu,…
When Attention Sink Emerges in Language Models: An Empirical View, by Xiangming Gu, Tianyu Pang, Chao…
SLaNC: Static LayerNorm Calibration, by Mahsa Salmani, Nikita Trukhanov, Ilya Soloveychik. First submitted to arXiv on: 14…
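The entries above all center on quantization of neural networks. As a point of reference only, and not a description of any listed paper's method, here is a minimal Python sketch of the textbook symmetric per-tensor int8 post-training quantization scheme, q = round(x / s) with s = max|x| / 127; the function names and the numpy-based setup are illustrative assumptions, not code from any of these papers:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    # Symmetric per-tensor scheme: choose one scale so the largest
    # magnitude in x maps to 127, then round to the nearest integer.
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover a float approximation of the original tensor.
    return q.astype(np.float32) * scale

# Toy round trip: quantize a random "weight" matrix and check the error.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
print("max abs round-trip error:", np.abs(w - w_hat).max())
```

Post-training methods like those listed here apply such a mapping (often with smarter, calibration-driven scale choices) to an already-trained model, rather than simulating quantization during training.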