Summary of Mamba-PTQ: Outlier Channels in Recurrent Large Language Models, by Alessandro Pierro et al.
Mamba-PTQ: Outlier Channels in Recurrent Large Language Models, by Alessandro Pierro, Steven Abreu. First submitted to arXiv…
Spectra: Surprising Effectiveness of Pretraining Ternary Language Models at Scale, by Ayush Kaushal, Tejas Vaidhya, Arnab…
Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment, by Yuhao Ji, Chao Fang,…
Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors, by Matt Gorbett,…
Exploring Quantization for Efficient Pre-Training of Transformer Language Models, by Kamran Chitsaz, Quentin Fournier, Gonçalo Mordido,…
LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices, by Jung Hyun…
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models, by Mengzhao Chen, Wenqi Shao, Peng Xu, Jiahao…
ISQuant: apply squant to the real deployment, by Dezan Zhao. First submitted to arXiv on: 5 Jul…
Integer-only Quantized Transformers for Embedded FPGA-based Time-series Forecasting in AIoT, by Tianheng Ling, Chao Qian, Gregor…
Fast Matrix Multiplications for Lookup Table-Quantized LLMs, by Han Guo, William Brandon, Radostin Cholakov, Jonathan Ragan-Kelley,…