Summary of A Nearly Optimal Single Loop Algorithm For Stochastic Bilevel Optimization Under Unbounded Smoothness, by Xiaochuan Gong et al.
A Nearly Optimal Single Loop Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness
by Xiaochuan Gong, Jie Hao, Mingrui Liu
First submitted to arXiv on: 28 Dec 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Optimization and Control (math.OC)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper proposes a new algorithm, the Single Loop bIlevel oPtimizer (SLIP), for stochastic bilevel optimization in which the upper-level function is nonconvex with potentially unbounded smoothness and the lower-level function is strongly convex. The setting is motivated by meta-learning on sequential data, such as text classification with recurrent neural networks. SLIP updates both variables simultaneously, using normalized stochastic gradient descent with momentum for the upper-level variable and stochastic gradient descent for the lower-level variable, and it finds an ϵ-stationary point within Õ(1/ϵ⁴) oracle calls to stochastic gradients or Hessian-vector products. This complexity is nearly optimal up to logarithmic factors and holds without assuming mean-square smoothness of the stochastic gradient oracle. Experiments on several tasks show that the algorithm significantly outperforms strong bilevel optimization baselines. (A code sketch of this single-loop update appears after the table.) |
Low | GrooveSquid.com (original content) | The paper studies a problem called stochastic bilevel optimization, where one optimization problem sits inside another. This kind of problem comes up when training artificial intelligence models, for example in meta-learning. The paper proposes a new algorithm called SLIP that handles both levels in a single loop. SLIP finds a good answer efficiently and works well with large, noisy datasets. Experiments show that it outperforms other strong methods on several tasks. |
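
To make the update scheme described in the medium summary concrete, here is a minimal, hypothetical Python sketch of a SLIP-style single loop. The oracle callables (grad_f_x, grad_f_y, grad_g_y, hvp_g_yy, jvp_g_xy), the step sizes, and the short Neumann-series hypergradient approximation are illustrative assumptions, not the paper's actual estimator or released code.

```python
import numpy as np

def slip_style_loop(x0, y0, grad_f_x, grad_f_y, grad_g_y, hvp_g_yy, jvp_g_xy,
                    T=1000, eta_x=1e-3, eta_y=1e-2, beta=0.9, alpha=0.1, K=5):
    """Illustrative single-loop bilevel sketch (not the paper's implementation).

    Hypothetical stochastic oracles, passed in as callables:
      grad_f_x(x, y), grad_f_y(x, y): gradients of the upper-level objective f
      grad_g_y(x, y): gradient of the strongly convex lower-level objective g in y
      hvp_g_yy(x, y, v): Hessian-vector product of g w.r.t. y applied to v
      jvp_g_xy(x, y, v): cross (x-y) Jacobian-vector product of g applied to v
    """
    x, y = np.asarray(x0, dtype=float), np.asarray(y0, dtype=float)
    m = np.zeros_like(x)  # momentum buffer for the upper-level direction

    for _ in range(T):
        # Compute both directions at the current (x, y) so the two variables
        # are updated simultaneously within one loop iteration.
        gy = grad_g_y(x, y)

        # Hypergradient estimate: approximate [Hess_yy g]^{-1} grad_y f with a
        # short Neumann series (illustrative; the paper uses its own estimator).
        p = grad_f_y(x, y)
        v = p.copy()
        for _ in range(K):
            p = p - alpha * hvp_g_yy(x, y, p)
            v = v + p
        v = alpha * v
        hyper = grad_f_x(x, y) - jvp_g_xy(x, y, v)

        # Lower level: plain stochastic gradient descent step on y.
        y = y - eta_y * gy

        # Upper level: normalized stochastic gradient descent with momentum.
        m = beta * m + (1.0 - beta) * hyper
        x = x - eta_x * m / (np.linalg.norm(m) + 1e-12)

    return x, y
```

A caller would supply mini-batch gradient and Hessian/Jacobian-vector-product oracles (for example, built with automatic differentiation in a deep-learning framework). The two points the sketch tries to convey are the simultaneous single-loop structure and the normalization of the momentum-averaged hypergradient for the upper-level update.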
Keywords
» Artificial intelligence » Meta learning » Optimization » Stochastic gradient descent » Text classification