Summary of Exponential Moving Average of Weights in Deep Learning: Dynamics and Benefits, by Daniel Morales-Brotons et al.
Exponential Moving Average of Weights in Deep Learning: Dynamics and Benefits, by Daniel Morales-Brotons, Thijs Vogels,…
Distributed Sign Momentum with Local Steps for Training Transformers, by Shuhua Yu, Ding Zhou, Cong Xie,…
Fast training of large kernel models with delayed projections, by Amirhesam Abedsoltan, Siyuan Ma, Parthe Pandit,…
Differentially Private Learning Beyond the Classical Dimensionality Regime, by Cynthia Dwork, Pranay Tankala, Linjun Zhang. First submitted…
A Unified Analysis for Finite Weight Averaging, by Peng Wang, Li Shen, Zerui Tao, Yan Sun,…
General framework for online-to-nonconvex conversion: Schedule-free SGD is also effective for nonconvex optimization, by Kwangjun Ahn,…
An Energy-Based Self-Adaptive Learning Rate for Stochastic Gradient Descent: Enhancing Unconstrained Optimization with VAV method, by…
Impact of Label Noise on Learning Complex Features, by Rahul Vashisht, P. Krishna Kumar, Harsha Vardhan…
Statistical-Computational Trade-offs for Recursive Adaptive Partitioning Estimators, by Yan Shuo Tan, Jason M. Klusowski, Krishnakumar Balasubramanian. First…
Scalable DP-SGD: Shuffling vs. Poisson Subsampling, by Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi,…