Summary of Anytime Acceleration Of Gradient Descent, by Zihan Zhang et al.
Anytime Acceleration of Gradient Descent, by Zihan Zhang, Jason D. Lee, Simon S. Du, Yuxin Chen. First…
Transformers are Deep Optimizers: Provable In-Context Learning for Deep Model Training, by Weimin Wu, Maojiang Su,…
Stability properties of gradient flow dynamics for the symmetric low-rank matrix factorization problem, by Hesameddin Mohammadi,…
Broad Critic Deep Actor Reinforcement Learning for Continuous Control, by Shiron Thalagala, Pak Kin Wong, Xiaozheng…
Gradient dynamics for low-rank fine-tuning beyond kernels, by Arif Kerem Dayi, Sitan Chen. First submitted to arXiv…
Applications of fractional calculus in learned optimization, by Teodor Alexandru Szente, James Harrison, Mihai Zanfir, Cristian…
Learning Differentiable Surrogate Losses for Structured Prediction, by Junjie Yang, Matthieu Labeau, Florence d'Alché-Buc. First submitted to…
One-Layer Transformer Provably Learns One-Nearest Neighbor In Context, by Zihao Li, Yuan Cao, Cheng Gao, Yihan…
Unraveling the Gradient Descent Dynamics of Transformers, by Bingqing Song, Boran Han, Shuai Zhang, Jie Ding,…
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching, by Arnav Kumar Jain, Harley Wiltzer, Jesse Farebrother,…