Summary of ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals, by Utkarsh Saxena et al.
ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals, by Utkarsh Saxena, Sayeh Sharify, Kaushik…
VidTok: A Versatile and Open-Source Video Tokenizer, by Anni Tang, Tianyu He, Junliang Guo, Xinle Cheng,…
Apollo-Forecast: Overcoming Aliasing and Inference Speed Challenges in Language Models for Time Series Forecasting, by Tianyi…
Fast and Slow Gradient Approximation for Binary Neural Network Optimization, by Xinquan Chen, Junqi Gao, Biqing…
QPruner: Probabilistic Decision Quantization for Structured Pruning in Large Language Models, by Changhai Zhou, Yuhua Zhou,…
FinLoRA: Finetuning Quantized Financial Large Language Models Using Low-Rank Adaptation, by Dannong Wang, Daniel Kim, Bo…
Progressive Compression with Universally Quantized Diffusion Models, by Yibo Yang, Justus C. Will, Stephan Mandt. First submitted…
Adaptive Quantization Resolution and Power Control for Federated Learning over Cell-free Networks, by Afsaneh Mahmoudi, Emil…
Memory-Efficient 4-bit Preconditioned Stochastic Optimization, by Jingyang Li, Kuangyu Ding, Kim-Chuan Toh, Pan Zhou. First submitted to…
Efficient Generative Modeling with Residual Vector Quantization-Based Tokens, by Jaehyeon Kim, Taehong Moon, Keon Lee, Jaewoong…