Summary of GPTVQ: The Blessing of Dimensionality for LLM Quantization, by Mart van Baalen et al.
GPTVQ: The Blessing of Dimensionality for LLM Quantization, by Mart van Baalen, Andrey Kuzmin, Markus Nagel,…
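Since only the title of the GPTVQ paper appears here, the sketch below illustrates the general idea its title alludes to: multi-dimensional (vector) quantization of LLM weights, where weight entries are grouped into short vectors, a small codebook is fit with k-means, and each group is replaced by its nearest centroid. This is a generic, minimal illustration, not the authors' algorithm; the function and parameter names (`vector_quantize`, `dim`, `codebook_size`) are assumptions for this example.

```python
# Minimal sketch of vector quantization of a weight matrix (illustrative only;
# not the GPTVQ method itself). Groups of `dim` weights share a codebook entry.
import numpy as np

def vector_quantize(weights, dim=2, codebook_size=256, iters=20, seed=0):
    """Quantize a 2-D weight matrix with a k-means codebook over dim-sized groups."""
    rng = np.random.default_rng(seed)
    flat = weights.reshape(-1)
    pad = (-flat.size) % dim                       # pad so groups divide evenly
    flat = np.concatenate([flat, np.zeros(pad)])
    groups = flat.reshape(-1, dim)                 # shape: (num_groups, dim)

    # Initialize centroids from random groups, then run plain k-means.
    centroids = groups[rng.choice(len(groups), codebook_size, replace=False)]
    for _ in range(iters):
        dists = ((groups[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)                   # nearest centroid per group
        for k in range(codebook_size):
            members = groups[assign == k]
            if len(members):
                centroids[k] = members.mean(0)

    # Reconstruct: each group becomes its nearest codebook entry.
    quantized = centroids[assign].reshape(-1)[: weights.size].reshape(weights.shape)
    return quantized, centroids, assign

if __name__ == "__main__":
    W = np.random.randn(128, 128).astype(np.float32)
    Wq, codebook, codes = vector_quantize(W, dim=2, codebook_size=64)
    print("mean squared quantization error:", float(((W - Wq) ** 2).mean()))
```

The point of grouping weights into vectors is that a single codebook index then encodes several weights at once, which is the "dimensionality" the paper's title refers to.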
APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models, by Ziyi Guan, Hantao Huang, Yupeng Su,…
Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation, by Phuc Phan, Hieu Tran,…
Text me the data: Generating Ground Pressure Sequence from Textual Descriptions for HAR, by Lala Shakti…
Towards a tailored mixed-precision sub-8-bit quantization scheme for Gated Recurrent Units using Genetic Algorithms, by Riccardo…
DB-LLM: Accurate Dual-Binarization for Efficient LLMs, by Hong Chen, Chengtao Lv, Liang Ding, Haotong Qin, Xiabin…
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More, by Yuxuan Yue, Zhihang…
EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the…
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs, by Yeonhong Park, Jake Hyun, SangLyul Cho, Bonggeun…
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control, by Ruijie Zheng, Ching-An Cheng,…