Quantization – Page 2 – GrooveSquid.com

July 13, 2025

Summary of An Exploration Of the Effect Of Quantisation on Energy Consumption and Inference Time Of StarCoder2, by Pepijn De Reus et al.

An exploration of the effect of quantisation on energy consumption and inference time of StarCoder2 by…

July 13, 2025

Summary of AMXFP4: Taming Activation Outliers with Asymmetric Microscaling Floating-point For 4-bit LLM Inference, by Janghwan Lee et al.

AMXFP4: Taming Activation Outliers with Asymmetric Microscaling Floating-Point for 4-bit LLM Inference by Janghwan Lee, Jiwoong…

July 13, 2025

Summary of The Super Weight in Large Language Models, by Mengxia Yu et al.

The Super Weight in Large Language Models by Mengxia Yu, De Wang, Qi Shan, Colorado Reed,…

July 13, 2025

Summary of Qwen2.5-32B: Leveraging Self-consistent Tool-integrated Reasoning For Bengali Mathematical Olympiad Problem Solving, by Saad Tahmid and Sourav Sarker

Qwen2.5-32B: Leveraging Self-Consistent Tool-Integrated Reasoning for Bengali Mathematical Olympiad Problem Solving by Saad Tahmid, Sourav Sarker First…

July 13, 2025

Summary of Aligned Vector Quantization For Edge-cloud Collabrative Vision-language Models, by Xiao Liu et al.

Aligned Vector Quantization for Edge-Cloud Collabrative Vision-Language Models by Xiao Liu, Lijun Zhang, Deepak Ganesan, Hui…

July 13, 2025

Summary of EoRA: Training-free Compensation For Compressed LLM with Eigenspace Low-rank Approximation, by Shih-yang Liu et al.

EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation by Shih-Yang Liu, Maksim Khadkevich, Nai…

July 13, 2025

Summary of A Counterexample in Cross-correlation Template Matching, by Serap A. Savari

Summary of Catastrophic Failure Of LLM Unlearning Via Quantization, by Zhiwei Zhang et al.

Summary of Lossless KV Cache Compression to 2%, by Zhen Yang et al.

Summary of Channel-wise Mixed-precision Quantization For Large Language Models, by Zihan Chen et al.