Summary of Analyzing Neural Scaling Laws in Two-layer Networks with Power-law Data Spectra, by Roman Worschech and Bernd Rosenow
Related papers:
Scaling Laws Across Model Architectures: A Comparative Analysis of Dense and MoE Models in Large…
Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices, by Andres Potapczynski, Shikai…
Neural Scaling Laws of Deep ReLU and Deep Operator Network: A Theoretical Study, by Hao Liu, …
How Feature Learning Can Improve Neural Scaling Laws, by Blake Bordelon, Alexander Atanasov, Cengiz Pehlevan. First submitted…
Rethinking Conventional Wisdom in Machine Learning: From Generalization to Scaling, by Lechao Xiao. First submitted to arxiv…
Exploring Scaling Laws for Local SGD in Large Language Model Training, by Qiaozhi He, Xiaomin Zhuang, …
Provable In-Context Learning of Linear Systems and Linear Elliptic PDEs with Transformers, by Frank Cole, Yulong…
Scaling Law Hypothesis for Multimodal Model, by Qingyun Sun, Zhen Guo, PIN AI Team. First submitted to…
Safety vs. Performance: How Multi-Objective Learning Reduces Barriers to Market Entry, by Meena Jagadeesan, Michael I. Jordan, …