


Flat Posterior Does Matter For Bayesian Model Averaging

by Sungjun Lim, Jeyoon Yeom, Sooyon Kim, Hoyoon Byun, Jinho Kang, Yohan Jung, Jiyoung Jung, Kyungwoo Song

First submitted to arXiv on: 21 Jun 2024

Categories

  • Main: Machine Learning (stat.ML)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Bayesian neural networks (BNNs) estimate the posterior distribution of model parameters and use posterior samples for Bayesian Model Averaging (BMA) at prediction time. However, despite the crucial role that flatness of the loss landscape plays in improving the generalization of neural networks, its impact on BMA has been largely overlooked. The paper explores how posterior flatness influences BMA generalization and demonstrates that most approximate Bayesian inference methods fail to yield a flat posterior. It also shows that BMA predictions made without considering posterior flatness are less effective at improving generalization. To address this, the authors propose Flat Posterior-aware Bayesian Model Averaging (FP-BMA), a novel training objective that encourages flat posteriors in a principled Bayesian manner. They also introduce a Flat Posterior-aware Bayesian Transfer Learning scheme that enhances generalization in downstream tasks. The paper empirically shows that FP-BMA successfully captures flat posteriors and improves generalization performance. A short code sketch illustrating this idea follows the summaries below.
Low Difficulty Summary (written by GrooveSquid.com, original content)
Bayesian neural networks help us predict things by considering many possible model settings and saying how likely each one is. But sometimes these predictions aren't very good, because the usual way of doing it ignores something important: whether those settings sit in a flat, stable region of the loss landscape rather than a sharp, fragile one. This paper looks at this problem and finds that most ways we currently do Bayesian Model Averaging (BMA) miss this detail. The authors propose a new way to do BMA that takes this flatness into account, which helps the averaged predictions be more accurate.
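
To make the idea above concrete, here is a minimal sketch in PyTorch. It is not the authors' released code: the toy classifier, the fixed mean-field Gaussian posterior, the function names (bma_predict, flatness_aware_step), and the hyperparameters (rho, lr, the posterior scale) are all illustrative assumptions. The sketch averages predictive probabilities over posterior weight samples (the BMA step) and uses a SAM-style worst-case weight perturbation as a stand-in for a flat-posterior-aware training objective.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import parameters_to_vector, vector_to_parameters

torch.manual_seed(0)

# Toy data and a small classifier; both are illustrative assumptions.
X, y = torch.randn(256, 20), torch.randint(0, 3, (256,))
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))

# Fixed mean-field Gaussian "posterior" over a flattened weight vector.
flat_mean = parameters_to_vector(model.parameters()).detach().clone()
log_std = torch.full_like(flat_mean, -3.0)  # small fixed posterior scale (assumption)

def bma_predict(x, n_samples=30):
    # Bayesian Model Averaging: average predictive probabilities over
    # weight samples drawn from the approximate posterior.
    probs = []
    for _ in range(n_samples):
        w = flat_mean + torch.exp(log_std) * torch.randn_like(flat_mean)
        vector_to_parameters(w, model.parameters())
        with torch.no_grad():
            probs.append(F.softmax(model(x), dim=-1))
    vector_to_parameters(flat_mean, model.parameters())  # restore the posterior mean
    return torch.stack(probs).mean(0)

def flatness_aware_step(mean, x, y, rho=0.05, lr=1e-2):
    # One SAM-style update of the posterior mean: evaluate the loss at a
    # worst-case weight perturbation so that sharp minima are penalized.
    vector_to_parameters(mean, model.parameters())
    loss = F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    g = torch.cat([gr.reshape(-1) for gr in grads])
    eps = rho * g / (g.norm() + 1e-12)        # ascent direction, scaled to radius rho
    vector_to_parameters(mean + eps, model.parameters())
    perturbed_loss = F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(perturbed_loss, list(model.parameters()))
    g = torch.cat([gr.reshape(-1) for gr in grads])
    return mean - lr * g                      # descend using the gradient at the perturbed point

for _ in range(200):
    flat_mean = flatness_aware_step(flat_mean, X, y)

acc = (bma_predict(X).argmax(dim=-1) == y).float().mean().item()
print(f"BMA training accuracy: {acc:.3f}")

The point of the perturbed-gradient step is that solutions whose loss rises sharply under a small weight perturbation are penalized, nudging the posterior mean toward flat regions where the averaged posterior samples all predict well.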

Keywords

» Artificial intelligence  » Bayesian inference  » Generalization  » Transfer learning