Summary of Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections, by Marco Miani et al.
Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections
by Marco Miani, Hrittik Roy, Søren Hauberg
First submitted to arXiv on: 22 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | Bayesian deep learning often underfits, yielding less accurate predictions than simple point estimates. To quantify uncertainty without sacrificing accuracy, we propose building Bayesian approximations within the null space of the generalized Gauss-Newton (GGN) matrix of the linearized model, which guarantees that the Bayesian predictive does not underfit. We provide a matrix-free algorithm for projecting onto this null space that scales linearly with the number of parameters and quadratically with the output dimension, plus an approximation that scales only linearly with parameters, making the method applicable to generative models. An extensive empirical evaluation demonstrates that the method scales to large models, including vision transformers with 28 million parameters (a sketch of the projection idea appears below the table). |
Low | GrooveSquid.com (original content) | Bayesian deep learning sometimes makes predictions that are worse than those of an ordinary, non-Bayesian model; this problem is called underfitting. To fix it, we found a way to build better Bayesian models by working inside the null space of a matrix called the generalized Gauss-Newton matrix, a set of weight directions along which the model's predictions do not change. This ensures the predictions stay just as accurate while the model still reports how uncertain it is. We also created an algorithm that makes this method easy to use and scales well to large models. |
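For readers who want to see the mechanics, here is a minimal, hedged JAX sketch of the alternating-projections idea described in the medium summary; it is an illustration under stated assumptions, not the authors' implementation. Each step projects a weight-space vector onto the null space of one example's Jacobian using only Jacobian-vector and vector-Jacobian products, so the only linear system ever formed is `out_dim × out_dim` (hence cost linear in parameters and quadratic in outputs). Cycling through the examples converges to the projection onto the null space of the full GGN of the linearized model, assuming a positive-definite loss Hessian. All function names and the toy model at the end are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

def project_onto_example_null_space(f, params, x, v):
    """Project v onto the null space of the per-example Jacobian
    J = d f(params, x) / d params, matrix-free:
        P v = v - J^T (J J^T)^{-1} J v.
    Only the small (out_dim x out_dim) matrix J J^T is ever formed."""
    g = lambda p: f(p, x)                       # model restricted to one input
    _, Jv = jax.jvp(g, (params,), (v,))         # J v  via forward mode
    out, vjp_fn = jax.vjp(g, params)            # J^T(.) via reverse mode
    out_dim = out.shape[0]

    def JJt_row(e):                             # e -> J (J^T e): one row of J J^T
        (Jt_e,) = vjp_fn(e)
        _, JJt_e = jax.jvp(g, (params,), (Jt_e,))
        return JJt_e

    JJt = jax.vmap(JJt_row)(jnp.eye(out_dim))   # small symmetric system
    a = jnp.linalg.solve(JJt, Jv)
    (Jt_a,) = vjp_fn(a)
    return v - Jt_a                             # P v

def alternating_projections(f, params, xs, v, n_sweeps=20):
    """Cycle through the per-example projections; the iterates converge to the
    projection of v onto the intersection of the null spaces, i.e. the null
    space of the full linearized-model GGN (for a PD loss Hessian)."""
    for _ in range(n_sweeps):
        for x in xs:
            v = project_onto_example_null_space(f, params, x, v)
    return v

# Toy usage (hypothetical model): a flat 6-parameter net with 2 outputs.
f = lambda p, x: jnp.tanh(x @ p.reshape(3, 2))
params = jnp.arange(6.0)
xs = [jnp.ones(3), jnp.array([1.0, -1.0, 0.5])]
v0 = jax.random.normal(jax.random.PRNGKey(0), (6,))
v_null = alternating_projections(f, params, xs, v0)
```

A sampled posterior direction `v_null` obtained this way leaves the (linearized) model's predictions on the training inputs unchanged, which is exactly why the resulting Bayesian predictive cannot underfit relative to the point estimate.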
Keywords
* Artificial intelligence
* Deep learning