Summary of GPTVQ: The Blessing of Dimensionality for LLM Quantization, by Mart van Baalen et al.
GPTVQ: The Blessing of Dimensionality for LLM Quantization, by Mart van Baalen, Andrey Kuzmin, Markus Nagel,…
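Since only the title of the GPTVQ paper appears here, the sketch below illustrates the general idea its title alludes to: multi-dimensional (vector) quantization of LLM weights, where weight entries are grouped into short vectors, a small codebook is fit with k-means, and each group is replaced by its nearest centroid. This is a generic, minimal illustration, not the authors' algorithm; the function and parameter names (`vector_quantize`, `dim`, `codebook_size`) are assumptions for this example.

```python
# Minimal sketch of vector quantization of a weight matrix (illustrative only;
# not the GPTVQ method itself). Groups of `dim` weights share a codebook entry.
import numpy as np

def vector_quantize(weights, dim=2, codebook_size=256, iters=20, seed=0):
    """Quantize a 2-D weight matrix with a k-means codebook over dim-sized groups."""
    rng = np.random.default_rng(seed)
    flat = weights.reshape(-1)
    pad = (-flat.size) % dim                       # pad so groups divide evenly
    flat = np.concatenate([flat, np.zeros(pad)])
    groups = flat.reshape(-1, dim)                 # shape: (num_groups, dim)

    # Initialize centroids from random groups, then run plain k-means.
    centroids = groups[rng.choice(len(groups), codebook_size, replace=False)]
    for _ in range(iters):
        dists = ((groups[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)                   # nearest centroid per group
        for k in range(codebook_size):
            members = groups[assign == k]
            if len(members):
                centroids[k] = members.mean(0)

    # Reconstruct: each group becomes its nearest codebook entry.
    quantized = centroids[assign].reshape(-1)[: weights.size].reshape(weights.shape)
    return quantized, centroids, assign

if __name__ == "__main__":
    W = np.random.randn(128, 128).astype(np.float32)
    Wq, codebook, codes = vector_quantize(W, dim=2, codebook_size=64)
    print("mean squared quantization error:", float(((W - Wq) ** 2).mean()))
```

The point of grouping weights into vectors is that a single codebook index then encodes several weights at once, which is the "dimensionality" the paper's title refers to.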
APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models, by Ziyi Guan, Hantao Huang, Yupeng Su,…
Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation, by Phuc Phan, Hieu Tran,…
Text me the data: Generating Ground Pressure Sequence from Textual Descriptions for HAR, by Lala Shakti…
Towards a tailored mixed-precision sub-8-bit quantization scheme for Gated Recurrent Units using Genetic Algorithms, by Riccardo…
DB-LLM: Accurate Dual-Binarization for Efficient LLMs, by Hong Chen, Chengtao Lv, Liang Ding, Haotong Qin, Xiabin…
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More, by Yuxuan Yue, Zhihang…
EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the…
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs, by Yeonhong Park, Jake Hyun, SangLyul Cho, Bonggeun…
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control, by Ruijie Zheng, Ching-An Cheng,…