Summary of Mechanistic Design and Scaling of Hybrid Architectures, by Michael Poli et al.
Mechanistic Design and Scaling of Hybrid Architectures by Michael Poli, Armin W Thomas, Eric Nguyen, Pragaash…
Language models scale reliably with over-training and on downstream tasks by Samir Yitzhak Gadre, Georgios Smyrnis,…
Mixtures of Experts Unlock Parameter Scaling for Deep RL by Johan Obando-Ceron, Ghada Sokar, Timon Willi,…
Scaling Laws for Fine-Grained Mixture of Experts by Jakub Krajewski, Jan Ludziejewski, Kamil Adamczewski, Maciej Pióro,…
Model Collapse Demystified: The Case of Regression by Elvis Dohmatob, Yunzhen Feng, Julia Kempe. First submitted to…
A Tale of Tails: Model Collapse as a Change of Scaling Laws by Elvis Dohmatob, Yunzhen…
A Resource Model For Neural Scaling Law by Jinyeop Song, Ziming Liu, Max Tegmark, Jeff Gore. First…
Scaling Laws for Downstream Task Performance in Machine Translation by Berivan Isik, Natalia Ponomareva, Hussein Hazimeh,…
Position: Graph Foundation Models are Already Here by Haitao Mao, Zhikai Chen, Wenzhuo Tang, Jianan Zhao,…
Towards Neural Scaling Laws on Graphs by Jingzhe Liu, Haitao Mao, Zhikai Chen, Tong Zhao, Neil…