Summary of AffineQuant: Affine Transformation Quantization for Large Language Models, by Yuexiao Ma et al.
AffineQuant: Affine Transformation Quantization for Large Language Models by Yuexiao Ma, Huixia Li, Xiawu Zheng, Feng…
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling by Tomasz Limisiewicz, Terra Blevins,…
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking by Eric Zelikman, Georges Harik, Yijia…
Language models scale reliably with over-training and on downstream tasks by Samir Yitzhak Gadre, Georgios Smyrnis,…
Benchmarking zero-shot stance detection with FlanT5-XXL: Insights from training data, prompting, and decoding strategies into…
Simple linear attention language models balance the recall-throughput tradeoff by Simran Arora, Sabri Eyuboglu, Michael Zhang,…
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits by Shuming Ma,…
Personalized Federated Instruction Tuning via Neural Architecture Search by Pengyu Zhang, Yingbo Zhou, Ming Hu, Junxian…
APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models by Ziyi Guan, Hantao Huang, Yupeng Su,…
Improving Language Understanding from Screenshots by Tianyu Gao, Zirui Wang, Adithya Bhaskar, Danqi Chen. First submitted to…