Summary of Effect of Weight Quantization on Learning Models by Typical Case Analysis, by Shuhei Kashiwamura et al.
Effect of Weight Quantization on Learning Models by Typical Case Analysis, by Shuhei Kashiwamura, Ayaka Sakata,…
One-Step Forward and Backtrack: Overcoming Zig-Zagging in Loss-Aware Quantization Training, by Lianbo Ma, Yuee Zhou, Jianlun…
Effective Communication with Dynamic Feature Compression, by Pietro Talli, Francesco Pase, Federico Chiariotti, Andrea Zanella, Michele…
Residual Quantization with Implicit Neural Codebooks, by Iris A. M. Huijben, Matthijs Douze, Matthew Muckley, Ruud…
CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks, by Andrei Tomut, Saeed S.…
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design, by Haojun Xia, Zhen Zheng, Xiaoxia…
Value-Driven Mixed-Precision Quantization for Patch-Based Inference on Microcontrollers, by Wei Tao, Shenglin He, Kai Lu, Xiaoyang…
Scaling Up Quantization-Aware Neural Architecture Search for Efficient Deep Learning on the Edge, by Yao Lu,…
Robustness to distribution shifts of compressed networks for edge devices, by Lulan Shen, Ali Edalati, Brett…
A2Q+: Improving Accumulator-Aware Weight Quantization, by Ian Colbert, Alessandro Pappalardo, Jakoba Petri-Koenig, Yaman Umuroglu. First submitted to…