Summary of AffineQuant: Affine Transformation Quantization for Large Language Models, by Yuexiao Ma et al.
AffineQuant: Affine Transformation Quantization for Large Language Models by Yuexiao Ma, Huixia Li, Xiawu Zheng, Feng…
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling by Tomasz Limisiewicz, Terra Blevins,…
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking by Eric Zelikman, Georges Harik, Yijia…
Language models scale reliably with over-training and on downstream tasks by Samir Yitzhak Gadre, Georgios Smyrnis,…
Benchmarking zero-shot stance detection with FlanT5-XXL: Insights from training data, prompting, and decoding strategies into…
Simple linear attention language models balance the recall-throughput tradeoff by Simran Arora, Sabri Eyuboglu, Michael Zhang,…
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits by Shuming Ma,…
Personalized Federated Instruction Tuning via Neural Architecture Search by Pengyu Zhang, Yingbo Zhou, Ming Hu, Junxian…
APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models by Ziyi Guan, Hantao Huang, Yupeng Su,…
Improving Language Understanding from Screenshots by Tianyu Gao, Zirui Wang, Adithya Bhaskar, Danqi Chen. First submitted to…