Summary of "Changing the Training Data Distribution to Reduce Simplicity Bias Improves In-distribution Generalization", by Dang Nguyen et al.
Changing the Training Data Distribution to Reduce Simplicity Bias Improves In-distribution Generalization
by Dang Nguyen, Paymon Haddad, Eric Gan, Baharan Mirzasoleiman
First submitted to arXiv on: 27 Apr 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The paper proposes modifying the training data distribution so that the optimization method is steered toward solutions that generalize better on in-distribution data. The authors compare the inductive bias of gradient descent (GD) with sharpness-aware minimization (SAM), showing that SAM learns features more uniformly and is less susceptible to simplicity bias. Their proposed method, USEFUL, clusters examples based on the network's output early in training, identifies a cluster of examples with similar outputs, and upsamples the rest to alleviate simplicity bias; a rough code sketch of this procedure appears below the table. Empirical results demonstrate improved generalization with various gradient methods, including (S)GD and SAM. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary Can we make machines learn better? Researchers tried to answer this question by comparing two ways computers train: gradient descent (GD) and sharpness-aware minimization (SAM). They found that SAM learns features more evenly and is less likely to get stuck on overly simple solutions. The team proposed a new method, USEFUL, to help computers generalize better to new data from the same distribution. It groups similar examples together and shows the remaining examples more often, so the computer learns beyond the easiest patterns. The researchers tested this method with different models and datasets, achieving state-of-the-art results. |
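To make the USEFUL procedure from the medium summary more concrete, here is a minimal sketch in Python. It assumes per-example network outputs and losses have already been collected after a few early training epochs; the function name `useful_upsample`, the use of k-means with two clusters, and taking the lowest-mean-loss cluster as the one "learned first" are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def useful_upsample(outputs, losses, upsample_factor=2, n_clusters=2, seed=0):
    """Sketch of the USEFUL idea: cluster early-training outputs, take the
    cluster with the lowest average loss as the one learned first, and
    upsample (duplicate) the remaining examples.

    outputs: (n_examples, n_outputs) array of network outputs early in training
    losses:  (n_examples,) array of per-example losses at the same point
    Returns an index array into the original dataset with the other cluster repeated.
    """
    clusters = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10).fit_predict(outputs)

    # Assumption: the cluster with the lowest mean loss is the one learned first.
    mean_loss = np.array([losses[clusters == c].mean() for c in range(n_clusters)])
    learned_first = int(np.argmin(mean_loss))

    # Duplicate every example outside the early-learned cluster.
    base = np.arange(len(losses))
    extra = np.repeat(base[clusters != learned_first], upsample_factor - 1)
    return np.concatenate([base, extra])

# Toy usage: two output clusters, one already fit well (low loss).
rng = np.random.default_rng(0)
outputs = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(1.0, 0.1, (50, 2))])
losses = np.concatenate([np.full(50, 0.05), np.full(50, 0.9)])
idx = useful_upsample(outputs, losses)
print(len(idx))  # 150: the 50 high-loss examples are duplicated once
```

The returned indices can then be used to build the upsampled training set on which the model is retrained from scratch, per the description in the summary above.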
Keywords
» Artificial intelligence » Generalization » Gradient descent » Optimization » SAM