Summary of Limit Theorems for Stochastic Gradient Descent with Infinite Variance, by Jose Blanchet et al.
Limit Theorems for Stochastic Gradient Descent with Infinite Variance
by Jose Blanchet, Aleksandar Mijatović, Wenhao Yang
First submitted to arXiv on: 21 Oct 2024
Categories
- Main: Machine Learning (stat.ML)
- Secondary: Machine Learning (cs.LG); Probability (math.PR)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here. |
| Medium | GrooveSquid.com (original content) | This paper investigates the theoretical properties of stochastic gradient descent (SGD) when the gradients have infinite variance. While SGD has been extensively studied under finite-variance assumptions, its behavior with infinite-variance gradients has received little attention. The authors establish the asymptotic behavior of SGD in this setting, assuming the stochastic gradient is regularly varying with index α ∈ (1, 2). They extend earlier results from 1969 to the multidimensional case and to a broader class of infinite-variance distributions, and they demonstrate applications of these results to linear regression and logistic regression models (see the simulation sketch after this table). |
| Low | GrooveSquid.com (original content) | This paper looks at how an important machine learning algorithm called stochastic gradient descent behaves when the data it learns from is very noisy. Right now, we don't have a good understanding of how the algorithm performs when that noise is extremely large. The researchers work out what happens to the algorithm in this regime and show that its behavior can be described by a type of mathematical process called an Ornstein-Uhlenbeck process. They also explain how these results apply to common machine learning models such as linear regression and logistic regression. |
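To make the infinite-variance setting concrete, here is a minimal simulation sketch, not taken from the paper: SGD for linear regression where the observation noise has Pareto tails with a hypothetical index α = 1.5 ∈ (1, 2), so the stochastic gradient has a finite mean but infinite variance. All parameter values, dimensions, and function names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: f(theta) = 0.5 * E[(x @ theta - y)^2],
# where y = x @ theta_star + noise and the noise has Pareto tails with
# index alpha in (1, 2): finite mean, infinite variance (hypothetical setup).
d = 3
theta_star = np.array([1.0, -2.0, 0.5])   # assumed "true" parameter
alpha = 1.5                                # assumed tail index in (1, 2)

def stochastic_gradient(theta):
    """Return one noisy gradient of the quadratic loss at theta."""
    x = rng.normal(size=d)
    # Symmetrized Pareto(Lomax)-type noise: infinite variance for alpha < 2.
    noise = rng.pareto(alpha) * rng.choice([-1.0, 1.0])
    y = x @ theta_star + noise
    return (x @ theta - y) * x

theta = np.zeros(d)
n_steps = 100_000
for n in range(1, n_steps + 1):
    eta = 1.0 / n                          # Robbins-Monro step size
    theta -= eta * stochastic_gradient(theta)

print("estimate:", theta)
print("error   :", np.linalg.norm(theta - theta_star))
```

Running this repeatedly with different seeds illustrates the heavy-tailed fluctuations of the SGD iterates that the paper's limit theorems describe; the sketch itself makes no claim about matching the authors' exact assumptions or scaling.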
Keywords
» Artificial intelligence » Linear regression » Logistic regression » Machine learning » Stochastic gradient descent