Summary of Central Limit Theorem for Bayesian Neural Network trained with Variational Inference, by Arnaud Descours et al.
Central Limit Theorem for Bayesian Neural Network trained with Variational Inference
by Arnaud Descours, Tom Huix, Arnaud Guillin, Manon Michel, Éric Moulines, Boris Nectoux
First submitted to arXiv on: 10 Jun 2024
Categories
- Main: Machine Learning (stat.ML)
- Secondary: Machine Learning (cs.LG); Probability (math.PR); Statistics Theory (math.ST)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract (available on arXiv). |
| Medium | GrooveSquid.com (original content) | The paper rigorously derives central limit theorems (CLTs) for Bayesian two-layer neural networks in the infinite-width limit, trained by variational inference on a regression task. The networks are trained by maximizing the regularized evidence lower bound under different schemes: idealized (exact) estimation, minibatch Monte Carlo sampling (Bayes-by-Backprop), and Minimal VI. Laws of large numbers and CLTs are proven for each scheme; the CLTs show that the idealized and Bayes-by-Backprop schemes have similar fluctuation behavior, whereas Minimal VI behaves differently. Despite its larger variance, Minimal VI is more efficient thanks to its lower computational cost (a schematic code sketch follows the table). This work contributes to our understanding of Bayesian neural networks by providing mathematical guarantees for their performance. |
| Low | GrooveSquid.com (original content) | This paper helps us understand how large neural networks can be trained with a technique called variational inference. It shows that several ways of training these networks behave similarly on average, but one of them (Minimal VI) is more efficient even though its results fluctuate more. The researchers used mathematics to prove this and then tested their ideas with computer simulations. |
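
To make the training objective concrete, here is a minimal, illustrative sketch (not the authors' code) of the minibatch Monte Carlo / Bayes-by-Backprop scheme mentioned above: a two-layer Bayesian network for regression whose Gaussian variational posterior is optimized by maximizing the regularized ELBO with reparametrized Monte Carlo samples. The network width, priors, noise level, activation, learning rate, 1/n output scaling, and number of samples are assumptions chosen to echo the mean-field setting described in the summary, not values from the paper.

```python
# Illustrative Bayes-by-Backprop-style sketch (assumed setup, not the paper's code):
# two-layer Bayesian network for regression, Gaussian mean-field variational posterior,
# regularized ELBO maximized with reparametrized Monte Carlo samples of the weights.
import torch

torch.manual_seed(0)

# Synthetic 1-D regression data (illustrative).
x = torch.linspace(-2, 2, 200).unsqueeze(1)
y = torch.sin(3 * x) + 0.1 * torch.randn_like(x)

n_hidden = 50      # network width (the paper studies the infinite-width limit)
prior_std = 1.0    # standard Gaussian prior on the weights (assumed)
noise_std = 0.1    # observation noise in the Gaussian likelihood (assumed)
n_mc = 4           # Monte Carlo samples per step (the "minibatch sampling" in the summary)

# Variational posterior: independent Gaussians, parameterized by mean and log-std.
mu = {
    "w1": torch.zeros(1, n_hidden, requires_grad=True),
    "b1": torch.zeros(n_hidden, requires_grad=True),
    "w2": torch.zeros(n_hidden, 1, requires_grad=True),
}
log_sigma = {k: torch.full_like(v, -2.0).requires_grad_(True) for k, v in mu.items()}

opt = torch.optim.Adam(list(mu.values()) + list(log_sigma.values()), lr=1e-2)

def forward(params, x):
    """Two-layer network with tanh activation and 1/n output scaling (illustrative)."""
    h = torch.tanh(x @ params["w1"] + params["b1"])
    return h @ params["w2"] / n_hidden

def kl_gaussian(mu, log_sigma, prior_std):
    """KL between N(mu, sigma^2) and the N(0, prior_std^2) prior, summed over weights."""
    sigma2 = torch.exp(2 * log_sigma)
    return 0.5 * torch.sum(sigma2 / prior_std**2 + mu**2 / prior_std**2
                           - 1 - 2 * log_sigma + 2 * torch.log(torch.tensor(prior_std)))

for step in range(2000):
    opt.zero_grad()
    # Monte Carlo estimate of the expected negative log-likelihood (reparametrization trick).
    nll = 0.0
    for _ in range(n_mc):
        params = {k: mu[k] + torch.exp(log_sigma[k]) * torch.randn_like(mu[k]) for k in mu}
        pred = forward(params, x)
        nll = nll + 0.5 * torch.sum((y - pred) ** 2) / noise_std**2
    nll = nll / n_mc
    # Regularized ELBO: negative ELBO = expected NLL + KL regularizer.
    kl = sum(kl_gaussian(mu[k], log_sigma[k], prior_std) for k in mu)
    loss = nll + kl
    loss.backward()
    opt.step()

print(f"final negative ELBO: {loss.item():.2f}")
```

In this sketch, the idealized scheme described in the summary would roughly correspond to replacing the inner Monte Carlo loop with an exact evaluation of the expected log-likelihood, while Minimal VI would reduce the per-step sampling cost at the price of a larger variance, as the summaries above describe.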
Keywords
» Artificial intelligence » Inference » Regression