Summary of Effect of Weight Quantization on Learning Models by Typical Case Analysis, by Shuhei Kashiwamura et al.
Effect of Weight Quantization on Learning Models by Typical Case Analysis, by Shuhei Kashiwamura, Ayaka Sakata,…
One-Step Forward and Backtrack: Overcoming Zig-Zagging in Loss-Aware Quantization Training, by Lianbo Ma, Yuee Zhou, Jianlun…
Effective Communication with Dynamic Feature Compression, by Pietro Talli, Francesco Pase, Federico Chiariotti, Andrea Zanella, Michele…
Residual Quantization with Implicit Neural Codebooks, by Iris A. M. Huijben, Matthijs Douze, Matthew Muckley, Ruud…
CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks, by Andrei Tomut, Saeed S.…
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design, by Haojun Xia, Zhen Zheng, Xiaoxia…
Value-Driven Mixed-Precision Quantization for Patch-Based Inference on Microcontrollers, by Wei Tao, Shenglin He, Kai Lu, Xiaoyang…
Scaling Up Quantization-Aware Neural Architecture Search for Efficient Deep Learning on the Edge, by Yao Lu,…
Robustness to distribution shifts of compressed networks for edge devices, by Lulan Shen, Ali Edalati, Brett…
A2Q+: Improving Accumulator-Aware Weight Quantization, by Ian Colbert, Alessandro Pappalardo, Jakoba Petri-Koenig, Yaman Umuroglu. First submitted to…