


Flat Posterior Does Matter For Bayesian Model Averaging

by Sungjun Lim, Jeyoon Yeom, Sooyon Kim, Hoyoon Byun, Jinho Kang, Yohan Jung, Jiyoung Jung, Kyungwoo Song

First submitted to arXiv on: 21 Jun 2024

Categories

  • Main: Machine Learning (stat.ML)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Bayesian neural networks (BNNs) estimate the posterior distribution of model parameters and use posterior samples for Bayesian Model Averaging (BMA) at prediction time. However, despite the crucial role that flatness of the loss landscape plays in improving the generalization of neural networks, its impact on BMA has been largely overlooked. The paper explores how posterior flatness influences BMA generalization and demonstrates that most approximate Bayesian inference methods fail to yield a flat posterior. It also shows that BMA predictions made without considering posterior flatness are less effective at improving generalization. To address this, the authors propose Flat Posterior-aware Bayesian Model Averaging (FP-BMA), a novel training objective that encourages flat posteriors in a principled Bayesian manner. They also introduce a Flat Posterior-aware Bayesian Transfer Learning scheme that enhances generalization in downstream tasks. The paper empirically shows that FP-BMA successfully captures flat posteriors and improves generalization performance. A short code sketch illustrating this idea follows the summaries below.
Low Difficulty Summary (written by GrooveSquid.com, original content)
Bayesian neural networks help us predict things by considering many possible model settings and saying how likely each one is. But sometimes these predictions aren't very good, because the usual way of doing it ignores something important: whether those settings sit in a flat, stable region of the loss landscape rather than a sharp, fragile one. This paper looks at this problem and finds that most ways we currently do Bayesian Model Averaging (BMA) miss this detail. The authors propose a new way to do BMA that takes this flatness into account, which helps the averaged predictions be more accurate.
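
To make the idea above concrete, here is a minimal sketch in PyTorch. It is not the authors' released code: the toy classifier, the fixed mean-field Gaussian posterior, the function names (bma_predict, flatness_aware_step), and the hyperparameters (rho, lr, the posterior scale) are all illustrative assumptions. The sketch averages predictive probabilities over posterior weight samples (the BMA step) and uses a SAM-style worst-case weight perturbation as a stand-in for a flat-posterior-aware training objective.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import parameters_to_vector, vector_to_parameters

torch.manual_seed(0)

# Toy data and a small classifier; both are illustrative assumptions.
X, y = torch.randn(256, 20), torch.randint(0, 3, (256,))
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))

# Fixed mean-field Gaussian "posterior" over a flattened weight vector.
flat_mean = parameters_to_vector(model.parameters()).detach().clone()
log_std = torch.full_like(flat_mean, -3.0)  # small fixed posterior scale (assumption)

def bma_predict(x, n_samples=30):
    # Bayesian Model Averaging: average predictive probabilities over
    # weight samples drawn from the approximate posterior.
    probs = []
    for _ in range(n_samples):
        w = flat_mean + torch.exp(log_std) * torch.randn_like(flat_mean)
        vector_to_parameters(w, model.parameters())
        with torch.no_grad():
            probs.append(F.softmax(model(x), dim=-1))
    vector_to_parameters(flat_mean, model.parameters())  # restore the posterior mean
    return torch.stack(probs).mean(0)

def flatness_aware_step(mean, x, y, rho=0.05, lr=1e-2):
    # One SAM-style update of the posterior mean: evaluate the loss at a
    # worst-case weight perturbation so that sharp minima are penalized.
    vector_to_parameters(mean, model.parameters())
    loss = F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    g = torch.cat([gr.reshape(-1) for gr in grads])
    eps = rho * g / (g.norm() + 1e-12)        # ascent direction, scaled to radius rho
    vector_to_parameters(mean + eps, model.parameters())
    perturbed_loss = F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(perturbed_loss, list(model.parameters()))
    g = torch.cat([gr.reshape(-1) for gr in grads])
    return mean - lr * g                      # descend using the gradient at the perturbed point

for _ in range(200):
    flat_mean = flatness_aware_step(flat_mean, X, y)

acc = (bma_predict(X).argmax(dim=-1) == y).float().mean().item()
print(f"BMA training accuracy: {acc:.3f}")

The point of the perturbed-gradient step is that solutions whose loss rises sharply under a small weight perturbation are penalized, nudging the posterior mean toward flat regions where the averaged posterior samples all predict well.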

Keywords

» Artificial intelligence  » Bayesian inference  » Generalization  » Transfer learning