Ask Your Distribution Shift if Pre-Training is Right for You

by Benjamin Cohen-Wang, Joshua Vendrow, Aleksander Madry

First submitted to arXiv on: 29 Feb 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each of the summaries below covers the same paper at a different level of difficulty: the medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper’s original abstract, available on arXiv.

Medium Difficulty Summary (original content by GrooveSquid.com)
This paper investigates when pre-training does and does not help models handle distribution shifts. Pre-training is a widely used approach for developing robust models, but its effectiveness varies greatly from one shift to another. The authors focus on two failure modes of robustness: poor extrapolation (for example, failing to generalize to a different domain) and biases in the training data (for example, relying on spurious correlations). They find that pre-training can mitigate poor extrapolation but not dataset biases. This suggests that pre-training and interventions designed to prevent models from exploiting biases have complementary benefits, and that fine-tuning on a small, de-biased dataset can yield a more robust model than fine-tuning on a large, biased one (a code sketch of this comparison follows these summaries). The authors provide theoretical motivation and empirical evidence for these findings.
Low Difficulty Summary (original content by GrooveSquid.com)
This paper looks at how well pre-trained models hold up when they are used in situations that differ from their training data. It turns out that pre-training can help models generalize to new kinds of data, but it does not fix problems caused by biased training data. The authors suggest combining pre-training with techniques that keep models from relying on biases. They also find that fine-tuning a model on a small amount of de-biased data can make it more robust than fine-tuning it on a large amount of biased data.
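
To make the comparison described above concrete, here is a minimal sketch (not the paper's code) of the two fine-tuning regimes: starting from the same pre-trained backbone, fine-tune once on a large biased dataset and once on a small de-biased subset, then evaluate both on a distribution-shifted test set. It assumes PyTorch and torchvision; the dataset names (`biased_train`, `debiased_train`, `shifted_test`) are hypothetical placeholders.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import models

def finetune(dataset, num_classes=2, epochs=5, lr=1e-4):
    # Each run starts from the same ImageNet pre-trained backbone, so the only
    # difference between the two regimes is the fine-tuning data.
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    loader = DataLoader(dataset, batch_size=64, shuffle=True)
    model.train()
    for _ in range(epochs):
        for inputs, labels in loader:
            optimizer.zero_grad()
            loss_fn(model(inputs), labels).backward()
            optimizer.step()
    return model

@torch.no_grad()
def accuracy(model, dataset):
    # Fraction of correct predictions on a held-out (possibly shifted) dataset.
    model.eval()
    correct = total = 0
    for inputs, labels in DataLoader(dataset, batch_size=256):
        correct += (model(inputs).argmax(dim=1) == labels).sum().item()
        total += labels.numel()
    return correct / total

# Hypothetical datasets: `biased_train` is large but contains a spurious
# correlation, `debiased_train` is a small subset in which that correlation is
# broken, and `shifted_test` probes robustness under distribution shift.
# model_biased = finetune(biased_train)
# model_debiased = finetune(debiased_train)
# print(accuracy(model_biased, shifted_test), accuracy(model_debiased, shifted_test))
```

In line with the paper's finding, one would expect `model_debiased` to score higher on `shifted_test` even though `debiased_train` is much smaller, since pre-training alone does not stop the model from exploiting the bias in `biased_train`.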

Keywords

  • Artificial intelligence
  • Fine-tuning
  • Machine learning