Summary of Q-S5: Towards Quantized State Space Models, by Steven Abreu et al.
Q-S5: Towards Quantized State Space Models by Steven Abreu, Jens E. Pedersen, Kade M. Heckel, Alessandro…
ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models by Jing Liu, Ruihao Gong, Mingyang…
Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization by Jiaxin Deng, Junbiao Pang, Baochang Zhang. First…
QuantMoE-Bench: Examining Post-Training Quantization for Mixture-of-Experts by Pingzhi Li, Xiaolong Jin, Zhen Tan, Yu Cheng, Tianlong…
Image and Video Tokenization with Binary Spherical Quantization by Yue Zhao, Yuanjun Xiong, Philipp Krähenbühl. First submitted…
TernaryLLM: Ternarized Large Language Model by Tianqi Chen, Zhe Li, Weixiang Xu, Zeyu Zhu, Dong Li,…
Low-Rank Quantization-Aware Training for LLMs by Yelysei Bondarenko, Riccardo Del Chiaro, Markus Nagel. First submitted to arXiv…
Efficient Neural Compression with Inference-time Decoding by C. Metz, O. Bichler, A. Dupret. First submitted to arXiv…
Winner-takes-all learners are geometry-aware conditional density estimators by Victor Letzelter, David Perera, Cédric Rommel, Mathieu Fontaine,…
QJL: 1-Bit Quantized JL Transform for KV Cache Quantization with Zero Overhead by Amir Zandieh, Majid…