Summary of The Uniqueness of LLaMA3-70B Series with Per-Channel Quantization, by Minghai Qin
The Uniqueness of LLaMA3-70B Series with Per-Channel Quantization, by Minghai Qin. First submitted to arXiv on: 27…
GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs, by Maxim Zhelnin, Viktor Moskvoretskii, Egor…
Variational autoencoder-based neural network model compression, by Liang Cheng, Peiyuan Guan, Amir Taherkordi, Lei Liu, Dapeng…
Adaptive Resolution Inference (ARI): Energy-Efficient Machine Learning for Internet of Things, by Ziheng Wang, Pedro Reviriego,…
1-Bit FQT: Pushing the Limit of Fully Quantized Training to 1-bit, by Chang Gao, Jianfei Chen,…
Jamba-1.5: Hybrid Transformer-Mamba Models at Scale, by Jamba Team, Barak Lenz, Alan Arazi, Amir Bergman, Avshalom…
Smartphone-based Eye Tracking System using Edge Intelligence and Model Optimisation, by Nishan Gunawardena, Gough Yumu Lui,…
Matmul or No Matmul in the Era of 1-bit LLMs, by Jinendra Malekar, Mohammed E. Elbtity,…
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models, by Elias Frantar, Roberto L. Castro, Jiale…