
Summary of “Changing the Training Data Distribution to Reduce Simplicity Bias Improves In-distribution Generalization,” by Dang Nguyen et al.


Changing the Training Data Distribution to Reduce Simplicity Bias Improves In-distribution Generalization

by Dang Nguyen, Paymon Haddad, Eric Gan, Baharan Mirzasoleiman

First submitted to arXiv on: 27 Apr 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract, which can be read on arXiv.

Medium Difficulty Summary (GrooveSquid.com original content)
The paper proposes modifying the training data distribution so that the optimization method finds solutions that perform better on in-distribution data. The authors compare the inductive bias of gradient descent (GD) with that of sharpness-aware minimization (SAM), showing that SAM learns features more uniformly and is less susceptible to simplicity bias. Their proposed method, USEFUL, clusters examples based on the network's output early in training, identifies the cluster of examples with similar outputs, and upsamples the remaining examples to alleviate simplicity bias (a rough code sketch of this procedure follows the summaries below). Empirical results demonstrate improved generalization performance with various gradient methods, including (S)GD and SAM.

Low Difficulty Summary (GrooveSquid.com original content)
Can we make machines learn better? Researchers tried to answer this question by comparing two ways computers train: gradient descent (GD) and sharpness-aware minimization (SAM). They found that SAM learns features more evenly and is less likely to get stuck in overly simple solutions. The team proposed a new method, USEFUL, to help models generalize better to new data. This approach clusters similar examples together and upsamples the remaining examples so the model keeps learning the harder ones. The researchers tested the method with different models and datasets, achieving state-of-the-art results.

Keywords

» Artificial intelligence  » Generalization  » Gradient descent  » Optimization  » SAM