Summary of Accumulator-Aware Post-Training Quantization, by Ian Colbert et al.
Accumulator-Aware Post-Training Quantization, by Ian Colbert, Fabian Grob, Giuseppe Franco, Jinjie Zhang, Rayan Saab. First submitted to…
AlignedKV: Reducing Memory Access of KV-Cache with Precision-Aligned Quantization, by Yifan Tan, Haoze Wang, Chao Yan,…
A Survey of Low-bit Large Language Models: Basics, Systems, and Algorithms, by Ruihao Gong, Yifu Ding,…
Communication and Energy Efficient Federated Learning using Zero-Order Optimization Technique, by Elissa Mhanna, Mohamad Assaad. First submitted…
Disentanglement with Factor Quantized Variational Autoencoders, by Gulcin Baykal, Melih Kandemir, Gozde Unal. First submitted to arXiv…
CorBin-FL: A Differentially Private Federated Learning Mechanism using Common Randomness, by Hojat Allah Salehi, Md Jueal…
Bilateral Sharpness-Aware Minimization for Flatter Minima, by Jiaxin Deng, Junbiao Pang, Baochang Zhang, Qingming Huang. First submitted…
Scaling FP8 training to trillion-token LLMs, by Maxim Fishman, Brian Chmiel, Ron Banner, Daniel Soudry. First submitted…
Pareto Data Framework: Steps Towards Resource-Efficient Decision Making Using Minimum Viable Data (MVD), by Tashfain Ahmed,…
Art and Science of Quantizing Large-Scale Models: A Comprehensive Overview, by Yanshu Wang, Tong Yang, Xiyan…