Summary of GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs, by Maxim Zhelnin et al.
GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs, by Maxim Zhelnin, Viktor Moskvoretskii, Egor…
The Uniqueness of LLaMA3-70B Series with Per-Channel Quantization, by Minghai Qin. First submitted to arXiv on: 27…
Variational autoencoder-based neural network model compression, by Liang Cheng, Peiyuan Guan, Amir Taherkordi, Lei Liu, Dapeng…
Adaptive Resolution Inference (ARI): Energy-Efficient Machine Learning for Internet of Things, by Ziheng Wang, Pedro Reviriego,…
1-Bit FQT: Pushing the Limit of Fully Quantized Training to 1-bit, by Chang Gao, Jianfei Chen,…
Jamba-1.5: Hybrid Transformer-Mamba Models at Scale, by Jamba Team, Barak Lenz, Alan Arazi, Amir Bergman, Avshalom…
Smartphone-based Eye Tracking System using Edge Intelligence and Model Optimisation, by Nishan Gunawardena, Gough Yumu Lui,…
Matmul or No Matmul in the Era of 1-bit LLMs, by Jinendra Malekar, Mohammed E. Elbtity,…
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models, by Elias Frantar, Roberto L. Castro, Jiale…
ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models, by Chao Zeng, Songwei Liu, Yusheng Xie,…