Summary of Restructuring Vector Quantization with the Rotation Trick, by Christopher Fifty et al.
Restructuring Vector Quantization with the Rotation Trick, by Christopher Fifty, Ronald G. Junkins, Dennis Duan, Aniketh…
Mixture Compressor for Mixture-of-Experts LLMs Gains More, by Wei Huang, Yue Liao, Jianhui Liu, Ruifei He,…
QT-DoG: Quantization-aware Training for Domain Generalization, by Saqib Javed, Hieu Le, Mathieu Salzmann. First submitted to arXiv…
QERA: an Analytical Framework for Quantization Error Reconstruction, by Cheng Zhang, Jeffrey T. H. Wong, Can…
Accelerating Error Correction Code Transformers, by Matan Levy, Yoni Choukroun, Lior Wolf. First submitted to arXiv on:…
PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms, by Yilong Li, Jingyu…
PrefixQuant: Eliminating Outliers by Prefixed Tokens for Large Language Models Quantization, by Mengzhao Chen, Yi Liu,…
Resource-aware Mixed-precision Quantization for Enhancing Deployability of Transformers for Time-series Forecasting on Embedded FPGAs, by Tianheng…
Mitigating Adversarial Perturbations for Deep Reinforcement Learning via Vector Quantization, by Tung M. Luu, Thanh Nguyen,…
ARB-LLM: Alternating Refined Binarizations for Large Language Models, by Zhiteng Li, Xianglong Yan, Tianao Zhang, Haotong…