Summary of Convergence Rate Analysis of LION, by Yiming Dong et al.


Convergence Rate Analysis of LION

by Yiming Dong, Huan Li, Zhouchen Lin

First submitted to arXiv on: 12 Nov 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Optimization and Control (math.OC)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
See the paper's original abstract.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
LION (evoLved sIgn mOmeNtum) is an optimizer that Google discovered through program search. It uses a simple sign-based update and has shown impressive performance in training large-scale deep neural networks. Previous studies have explored its convergence properties, but a comprehensive analysis of its convergence rate has been lacking. This paper fills that gap by showing that LION converges to a Karush-Kuhn-Tucker (KKT) point at a rate of O(sqrt(d) K^(-1/4)), measured by the gradient L1 norm, where d is the problem dimension and K is the number of iterations. The authors then remove the constraint and show that LION converges to a critical point of the general unconstrained problem at the same rate. This rate matches the theoretical lower bound for nonconvex stochastic optimization algorithms, a bound that is typically stated in terms of the gradient L2 norm. In extensive experiments, LION achieves lower loss and higher performance than standard SGD.

Low Difficulty Summary (written by GrooveSquid.com, original content)
LION is a new optimizer that helps train big neural networks quickly and accurately. Google found it by searching through many candidate programs. People have studied its properties before, but a full analysis of how fast it converges was still missing. This paper shows that LION gets close to a good answer (called a KKT point) at a speed that is as fast as possible for a problem of this type. The same speed holds when the problem has no constraints. The authors also tested LION and found that it does better than another popular optimizer called SGD.
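
For readers who want to see what a "sign update" looks like in practice, the sketch below shows a single LION step in Python with NumPy. It follows the update rule as published for the original Lion optimizer, not anything spelled out in the summaries above, and the function names `lion_step` and `l1_l2_ratio` as well as the default hyperparameters are illustrative assumptions. The second helper relates the gradient L1 and L2 norms: for any nonzero vector in d dimensions the ratio lies between 1 and sqrt(d), which may give some intuition for why a sqrt(d) factor shows up when a rate is measured in the L1 norm.

```python
import numpy as np


def lion_step(theta, m, grad, lr=1e-4, beta1=0.9, beta2=0.99, weight_decay=0.0):
    """One LION update (sketch). Returns (new_theta, new_m).

    theta, m, and grad are NumPy arrays of the same shape.
    Default hyperparameters are illustrative assumptions.
    """
    # Blend momentum with the current gradient, then keep only the sign,
    # so every coordinate moves by the same magnitude lr.
    update = np.sign(beta1 * m + (1.0 - beta1) * grad)
    # Sign step plus decoupled weight decay.
    new_theta = theta - lr * (update + weight_decay * theta)
    # The stored momentum uses a second, slower interpolation factor.
    new_m = beta2 * m + (1.0 - beta2) * grad
    return new_theta, new_m


def l1_l2_ratio(g):
    """Ratio ||g||_1 / ||g||_2 for a nonzero 1-D vector g."""
    return np.linalg.norm(g, 1) / np.linalg.norm(g, 2)


# Toy usage: minimize f(x) = 0.5 * ||x||^2, whose gradient is x itself.
x = np.array([1.0, -2.0])
momentum = np.zeros_like(x)
for _ in range(100):
    x, momentum = lion_step(x, momentum, grad=x, lr=0.01)
print("iterate after 100 steps:", x)
print("L1/L2 ratio of a dense vector in 16 dims:", l1_l2_ratio(np.ones(16)))
```

Because the step direction is a sign vector, the step size in each coordinate is fixed by `lr` regardless of gradient scale, which is one reason convergence for this kind of method is naturally measured in the gradient L1 norm.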

Keywords

  • Artificial intelligence
  • Optimization