Summary of Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections, by Marco Miani et al.
Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections
by Marco Miani, Hrittik Roy, Søren Hauberg
First submitted to arXiv on: 22 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | Bayesian deep learning often underfits, yielding less accurate predictions than simple point estimates. To quantify uncertainty without sacrificing accuracy, we propose building Bayesian approximations within the null space of the generalized Gauss-Newton (GGN) matrix of the linearized model, which guarantees that the Bayesian predictive does not underfit. We provide a matrix-free algorithm for projecting onto this null space that scales linearly with the number of parameters and quadratically with the output dimension, plus an approximation that scales only linearly with parameters, making the method applicable to generative models. An extensive empirical evaluation demonstrates that the method scales to large models, including vision transformers with 28 million parameters (a sketch of the projection idea appears below the table). |
Low | GrooveSquid.com (original content) | Bayesian deep learning sometimes makes predictions that are worse than those of an ordinary, non-Bayesian model; this problem is called underfitting. To fix it, we found a way to build better Bayesian models by working inside the null space of a matrix called the generalized Gauss-Newton matrix, a set of weight directions along which the model's predictions do not change. This ensures the predictions stay just as accurate while the model still reports how uncertain it is. We also created an algorithm that makes this method easy to use and scales well to large models. |
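For readers who want to see the mechanics, here is a minimal, hedged JAX sketch of the alternating-projections idea described in the medium summary; it is an illustration under stated assumptions, not the authors' implementation. Each step projects a weight-space vector onto the null space of one example's Jacobian using only Jacobian-vector and vector-Jacobian products, so the only linear system ever formed is `out_dim × out_dim` (hence cost linear in parameters and quadratic in outputs). Cycling through the examples converges to the projection onto the null space of the full GGN of the linearized model, assuming a positive-definite loss Hessian. All function names and the toy model at the end are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

def project_onto_example_null_space(f, params, x, v):
    """Project v onto the null space of the per-example Jacobian
    J = d f(params, x) / d params, matrix-free:
        P v = v - J^T (J J^T)^{-1} J v.
    Only the small (out_dim x out_dim) matrix J J^T is ever formed."""
    g = lambda p: f(p, x)                       # model restricted to one input
    _, Jv = jax.jvp(g, (params,), (v,))         # J v  via forward mode
    out, vjp_fn = jax.vjp(g, params)            # J^T(.) via reverse mode
    out_dim = out.shape[0]

    def JJt_row(e):                             # e -> J (J^T e): one row of J J^T
        (Jt_e,) = vjp_fn(e)
        _, JJt_e = jax.jvp(g, (params,), (Jt_e,))
        return JJt_e

    JJt = jax.vmap(JJt_row)(jnp.eye(out_dim))   # small symmetric system
    a = jnp.linalg.solve(JJt, Jv)
    (Jt_a,) = vjp_fn(a)
    return v - Jt_a                             # P v

def alternating_projections(f, params, xs, v, n_sweeps=20):
    """Cycle through the per-example projections; the iterates converge to the
    projection of v onto the intersection of the null spaces, i.e. the null
    space of the full linearized-model GGN (for a PD loss Hessian)."""
    for _ in range(n_sweeps):
        for x in xs:
            v = project_onto_example_null_space(f, params, x, v)
    return v

# Toy usage (hypothetical model): a flat 6-parameter net with 2 outputs.
f = lambda p, x: jnp.tanh(x @ p.reshape(3, 2))
params = jnp.arange(6.0)
xs = [jnp.ones(3), jnp.array([1.0, -1.0, 0.5])]
v0 = jax.random.normal(jax.random.PRNGKey(0), (6,))
v_null = alternating_projections(f, params, xs, v0)
```

A sampled posterior direction `v_null` obtained this way leaves the (linearized) model's predictions on the training inputs unchanged, which is exactly why the resulting Bayesian predictive cannot underfit relative to the point estimate.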
Keywords
* Artificial intelligence
* Deep learning