Stochastic Modified Flows for Riemannian Stochastic Gradient Descent

by Benjamin Gess, Sebastian Kassing, Nimit Rana

First submitted to arXiv on: 2 Feb 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Optimization and Control (math.OC); Probability (math.PR); Machine Learning (stat.ML)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract; read it on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The researchers give quantitative convergence rates for Riemannian stochastic gradient descent (RSGD) toward two continuous-time limits: the deterministic Riemannian gradient flow and a diffusion process called the Riemannian stochastic modified flow (RSMF). Using tools from stochastic differential geometry, they show that, in the small learning rate regime, RSGD can be approximated by the solution to the RSMF, which is driven by an infinite-dimensional Wiener process. Because the RSMF accounts for the random fluctuations of RSGD, it achieves a higher order of approximation than the deterministic Riemannian gradient flow. RSGD is built from a retraction map, a cost-efficient approximation of the exponential map, and the quantitative bounds for the weak error of the diffusion approximation are proved under assumptions on the retraction map, the geometry of the manifold, and the random gradient estimators.
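To make the setup concrete, here is a minimal sketch of RSGD on the unit sphere, with the normalization retraction standing in for the exact exponential map. The objective (a noisy Rayleigh quotient), the noise model, the learning rate, and all names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 5
A = rng.standard_normal((d, d))
A = (A + A.T) / 2  # symmetric matrix: minimize f(x) = x^T A x on the unit sphere

def stochastic_grad(x):
    # Unbiased but noisy estimator of the Euclidean gradient 2 A x.
    return 2 * A @ x + 0.1 * rng.standard_normal(d)

def tangent_projection(x, g):
    # Riemannian gradient on the sphere: remove the radial component of g.
    return g - (x @ g) * x

def retract(x, v):
    # Normalization retraction: a cost-efficient first-order approximation
    # of the sphere's exponential map (which would follow a great circle).
    y = x + v
    return y / np.linalg.norm(y)

def rsgd(x, lr=1e-2, steps=5000):
    for _ in range(steps):
        g = tangent_projection(x, stochastic_grad(x))
        x = retract(x, -lr * g)  # each iterate stays on the manifold
    return x

x0 = rng.standard_normal(d)
x = rsgd(x0 / np.linalg.norm(x0))
print("f(x) =", x @ A @ x)
print("smallest eigenvalue of A =", np.linalg.eigvalsh(A).min())
```

With a small learning rate, the iterates hover near the eigenvector of the smallest eigenvalue of A while fluctuating due to the gradient noise; it is exactly these fluctuations that the RSMF models and the deterministic gradient flow ignores.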
Low Difficulty Summary (written by GrooveSquid.com, original content)
Riemannian stochastic gradient descent is a way for computers to learn from data when the quantities being learned live on a curved space. Scientists studied how well this method works by measuring how fast it converges to two different continuous-time processes. They found that when the learning rate is small, the algorithm is well approximated by a process called the Riemannian stochastic modified flow, which takes its random fluctuations into account and therefore describes its behavior more accurately than the simpler deterministic approximation. The researchers used special tools from geometry and probability to prove their findings.
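For readers who want the flavor of the quantitative statement, weak-error bounds in this line of work typically take the shape sketched below; the exact exponents, constants, assumptions, and test-function classes are those stated in the paper, not the ones shown here.

```latex
% Hedged sketch of the weak-error pattern: X_n is the RSGD iterate with
% learning rate h, \bar{X} and \tilde{X} solve the Riemannian gradient flow
% and the RSMF respectively, and g is a smooth test function. The order-one
% versus order-two gap follows the Euclidean stochastic-modified-flow
% literature; consult the paper for the precise Riemannian statement.
\[
  \bigl|\mathbb{E}\,g(X_n) - \mathbb{E}\,g(\bar{X}_{nh})\bigr| \le C\,h
  \qquad \text{(deterministic Riemannian gradient flow)}
\]
\[
  \bigl|\mathbb{E}\,g(X_n) - \mathbb{E}\,g(\tilde{X}_{nh})\bigr| \le C\,h^{2}
  \qquad \text{(Riemannian stochastic modified flow)}
\]
```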

Keywords

* Artificial intelligence  * Diffusion  * Machine learning  * Stochastic gradient descent