An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness

by Xiaochuan Gong, Jie Hao, Mingrui Liu

First submitted to arXiv on: 28 Sep 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Optimization and Control (math.OC)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper focuses on stochastic bilevel optimization, a class of problems important for sequential data learning applications such as text classification with recurrent neural networks. The upper-level function is non-convex with unbounded smoothness, while the lower-level problem is strongly convex. Existing algorithms require up to Õ(1/ε^4) oracle calls to find an ε-stationary point. To improve this rate, the authors propose AccBO, an accelerated algorithm that updates the upper-level variable by normalized stochastic gradient descent with recursive momentum and the lower-level variable by stochastic Nesterov accelerated gradient descent with averaging. The paper proves that AccBO achieves an oracle complexity of Õ(1/ε^3) when the lower-level stochastic gradient has variance O(ε). The theoretical analysis relies on a novel lemma characterizing the dynamics of stochastic Nesterov accelerated gradient descent under distribution drift, which also plays a crucial role in bounding the hypergradient estimation error. Experimental results confirm the predicted acceleration, with AccBO outperforming the baselines.
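
To make the two-timescale update concrete, here is a minimal Python sketch of the loop described above. It is not the paper's exact algorithm: the oracles hypergrad_oracle and lower_grad_oracle, the constant step sizes, and the simplified momentum recursion are all illustrative assumptions, and the paper's hypergradient estimator is abstracted into a single oracle call.

```python
import numpy as np

def accbo_sketch(x, y, hypergrad_oracle, lower_grad_oracle,
                 steps=1000, eta=0.01, gamma=0.1, beta=0.9, mu=0.5):
    # hypergrad_oracle(x, y): stochastic hypergradient estimate at (x, y)
    # lower_grad_oracle(x, y): stochastic gradient of the lower-level loss in y
    m = hypergrad_oracle(x, y)        # upper-level momentum buffer
    z = y.copy()                      # Nesterov extrapolation point
    y_avg = y.copy()                  # running average of lower-level iterates
    for t in range(1, steps + 1):
        # Lower level: stochastic Nesterov accelerated gradient descent
        y_new = z - gamma * lower_grad_oracle(x, z)
        z = y_new + mu * (y_new - y)  # extrapolation step
        y = y_new
        y_avg += (y - y_avg) / t      # iterate averaging

        # Upper level: recursive momentum on the stochastic hypergradient.
        # Simplified here to an exponential moving average; the paper's
        # STORM-style recursion also reuses the previous sample for
        # variance reduction.
        g = hypergrad_oracle(x, y_avg)
        m = g + beta * (m - g)

        # Normalized step: the step length stays bounded regardless of the
        # gradient scale, which is what suits unbounded-smoothness objectives.
        x = x - eta * m / (np.linalg.norm(m) + 1e-12)
    return x, y
```

The design choice visible in the sketch is the normalization of the upper-level step: even when the gradient, and hence the local smoothness, grows, the effective step length remains bounded.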
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about solving a special type of math problem that helps computers learn from data, like recognizing text patterns. The problem involves two parts: one is easy to solve, while the other is harder. The authors want to make it faster and better at finding the right solution. They come up with a new algorithm called AccBO that makes it work faster by using different methods to update the variables. The paper shows that this algorithm works well in practice and can even outdo previous solutions.

Keywords

» Artificial intelligence  » Gradient descent  » Optimization  » Stochastic gradient descent  » Text classification