Summary of Reverse-engineering the Reader, by Samuel Kiegeland et al.
Reverse-Engineering the Reader, by Samuel Kiegeland, Ethan Gotlieb Wilcox, Afra Amini, David Robert Reich, Ryan Cotterell. First submitted to arxiv on: …