Summary of An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness, by Xiaochuan Gong et al.
An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness
by Xiaochuan Gong, Jie Hao, Mingrui Liu
First submitted to arXiv on: 28 Sep 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Optimization and Control (math.OC)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper but are written at different levels of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract here
Medium | GrooveSquid.com (original content) | This paper studies stochastic bilevel optimization, a class of problems important for sequential data learning applications such as text classification with recurrent neural networks. The upper-level function is nonconvex with potentially unbounded smoothness, while the lower-level problem is strongly convex. The best-known algorithms require Õ(1/ε^4) oracle calls to find an ε-stationary point. To improve this rate, the authors propose AccBO, an accelerated algorithm that updates the upper-level variable by normalized stochastic gradient descent with recursive momentum and the lower-level variable by stochastic Nesterov accelerated gradient descent with averaging. The paper proves that AccBO achieves an oracle complexity of Õ(1/ε^3) when the lower-level stochastic gradient variance is O(ε). The analysis relies on a novel lemma characterizing the dynamics of stochastic Nesterov accelerated gradient descent under distribution drift, which also plays a crucial role in bounding the hypergradient estimation error. Experimental results confirm the predicted acceleration and show that AccBO outperforms baselines. A hedged code sketch of the two update rules appears after this table.
Low | GrooveSquid.com (original content) | This paper is about solving a special type of math problem that helps computers learn from data, such as recognizing text patterns. The problem has two parts: one is easier to solve, and the other is harder. The authors want to find the right solution faster. They propose a new algorithm called AccBO that speeds things up by using different methods to update each part. The paper shows that this algorithm works well in practice and outperforms previous approaches.
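To make the two update rules from the medium-difficulty summary concrete, here is a minimal Python/NumPy sketch on a toy quadratic bilevel problem. The structure — a STORM-style recursive-momentum, normalized update for the upper-level variable and a Nesterov-accelerated, averaged update for the lower-level variable — follows the summary's description, but the problem instance, step sizes, and names such as `accbo_sketch` and `hypergrad` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy quadratic bilevel instance (an assumption for illustration, not the paper's setup):
#   lower level: y*(x) = argmin_y 0.5 * ||y - A @ x||**2   (strongly convex in y)
#   upper level: F(x)  = 0.5 * ||y*(x) - b||**2
d = 5
A = rng.normal(size=(d, d)) / np.sqrt(d)
b = rng.normal(size=d)
NOISE = 0.01  # scale of the synthetic stochastic-gradient noise

def grad_g_y(x, y):
    """Noisy stochastic gradient of the lower-level objective in y."""
    return (y - A @ x) + NOISE * rng.normal(size=d)

def hypergrad(x, y, xi):
    """Noisy hypergradient estimate at (x, y). The noise sample xi is passed in
    explicitly so the recursive-momentum update can reuse the same sample at two
    consecutive points. Uses the closed form dy*/dx = A available for this toy
    problem, standing in for the paper's implicit-function-based estimator."""
    return A.T @ (y - b) + NOISE * xi

def accbo_sketch(T=2000, gamma=0.05, alpha=0.1, beta=0.5, eta=0.1):
    x = np.zeros(d)       # upper-level variable
    y = np.zeros(d)       # lower-level iterate
    z = np.zeros(d)       # Nesterov auxiliary iterate
    y_avg = np.zeros(d)   # running average of lower-level iterates
    m = hypergrad(x, y_avg, rng.normal(size=d))  # momentum estimator
    x_prev, y_avg_prev = x.copy(), y_avg.copy()
    for t in range(1, T + 1):
        # Lower level: one stochastic Nesterov accelerated gradient step,
        # followed by iterate averaging.
        y_new = z - alpha * grad_g_y(x, z)
        z = y_new + beta * (y_new - y)
        y = y_new
        y_avg = ((t - 1) * y_avg + y) / t

        # Upper level: recursive momentum on the hypergradient, evaluating the
        # same noise sample xi at the current and previous point
        # (m_t = g_t + (1 - eta) * (m_{t-1} - g'_{t-1})), then a normalized step.
        xi = rng.normal(size=d)
        g_curr = hypergrad(x, y_avg, xi)
        g_prev = hypergrad(x_prev, y_avg_prev, xi)
        m = g_curr + (1.0 - eta) * (m - g_prev)
        x_prev, y_avg_prev = x.copy(), y_avg.copy()
        x = x - gamma * m / (np.linalg.norm(m) + 1e-12)
    return x, y_avg

x_out, y_out = accbo_sketch()
print("final upper-level objective:", 0.5 * np.linalg.norm(y_out - b) ** 2)
```

The sketch interleaves one lower-level step per upper-level step for brevity; the normalized update keeps the upper-level step length fixed regardless of gradient magnitude, which is what makes the scheme suitable for the unbounded-smoothness setting the paper targets.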
Keywords
» Artificial intelligence » Gradient descent » Optimization » Stochastic gradient descent » Text classification