Quantization – Page 33 – GrooveSquid.com

July 13, 2025

Summary of Tender: Accelerating Large Language Models Via Tensor Decomposition and Runtime Requantization, by Jungi Lee et al.

Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization by Jungi Lee, Wonbeom Lee,…

July 13, 2025

Summary of Mixture Of Scales: Memory-efficient Token-adaptive Binarization For Large Language Models, by Dongwon Jo et al.

Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models by Dongwon Jo, Taesu Kim, Yulhwa…

July 13, 2025

Summary of Prefixing Attention Sinks Can Mitigate Activation Outliers For Large Language Model Quantization, by Seungwoo Son et al.

Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization by Seungwoo Son, Wonpyo…

July 13, 2025

Summary of ExCP: Extreme LLM Checkpoint Compression Via Weight-momentum Joint Shrinking, by Wenshuo Li et al.

ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking by Wenshuo Li, Xinghao Chen, Han Shu,…

July 13, 2025

Summary of QTIP: Quantization with Trellises and Incoherence Processing, by Albert Tseng et al.

QTIP: Quantization with Trellises and Incoherence Processing by Albert Tseng, Qingyao Sun, David Hou, Christopher De…

July 13, 2025

Summary of Promoting Data and Model Privacy in Federated Learning Through Quantized LoRA, by Jianhao Zhu et al.

Promoting Data and Model Privacy in Federated Learning through Quantized LoRA by JianHao Zhu, Changze Lv,…

July 13, 2025

Summary of Memory Faults in Activation-sparse Quantized Deep Neural Networks: Analysis and Mitigation Using Sharpness-aware Training, by Akul Malhotra et al.

Memory Faults in Activation-sparse Quantized Deep Neural Networks: Analysis and Mitigation using Sharpness-aware Training by Akul…

July 13, 2025

Summary of MobileAIBench: Benchmarking LLMs and LMMs For On-device Use Cases, by Rithesh Murthy et al.

MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases by Rithesh Murthy, Liangwei Yang, Juntao Tan,…

July 13, 2025

Summary of Precipitation Nowcasting Using Physics Informed Discriminator Generative Models, by Junzhe Yin et al.

Precipitation Nowcasting Using Physics Informed Discriminator Generative Models by Junzhe Yin, Cristian Meo, Ankush Roy, Zeineh…

July 13, 2025

Summary of QQQ: Quality Quattuor-bit Quantization For Large Language Models, by Ying Zhang et al.

QQQ: Quality Quattuor-Bit Quantization for Large Language Models by Ying Zhang, Peng Zhang, Mincong Huang, Jingyang…