


Predicting the Performance of Foundation Models via Agreement-on-the-Line

by Rahul Saxena, Taeyoun Kim, Aman Mehra, Christina Baek, Zico Kolter, Aditi Raghunathan

First submitted to arXiv on: 2 Apr 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Medium Difficulty Summary: Estimating the out-of-distribution (OOD) performance of foundation models is crucial for deploying them safely. Recent work on "agreement-on-the-line" in neural network ensembles offers a way to predict OOD performance without labels. However, because foundation models are only lightly finetuned from heavily pre-trained weights, the resulting ensembles may lack the diversity needed for agreement-on-the-line to hold. This study shows that lightly finetuning multiple runs of a single foundation model yields drastically different levels of agreement-on-the-line depending on the source of training randomness: linear head initialization, data ordering, or data subsetting. Surprisingly, only random head initialization reliably induces agreement-on-the-line in finetuned foundation models, across both vision and language benchmarks. In addition, ensembles of multiple foundation models pre-trained on different datasets but finetuned on the same task also exhibit agreement-on-the-line. By constructing such a diverse ensemble, the study demonstrates that agreement-on-the-line-based methods can predict OOD performance with high precision.
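As a rough illustration of the idea (not code from the paper), the agreement-on-the-line recipe can be sketched as follows: measure how often pairs of ensemble members agree on in-distribution (ID) and OOD data, fit a line relating the two in probit-scaled axes, and, since agreement and accuracy tend to lie on the same line, map a model's ID accuracy through that line to estimate its OOD accuracy without OOD labels. The function names and the agreement numbers below are made up for the example.

```python
import numpy as np
from statistics import NormalDist

_norm = NormalDist()  # standard normal, used for the probit scaling

def agreement_rate(preds_a, preds_b):
    """Fraction of examples on which two models predict the same label."""
    return float(np.mean(np.asarray(preds_a) == np.asarray(preds_b)))

def fit_agreement_line(agree_id, agree_ood):
    """Fit OOD agreement vs. ID agreement as a line in probit space."""
    x = [_norm.inv_cdf(p) for p in agree_id]
    y = [_norm.inv_cdf(p) for p in agree_ood]
    slope, bias = np.polyfit(x, y, 1)
    return slope, bias

def predict_ood_accuracy(acc_id, slope, bias):
    """If accuracy lies on the same line as agreement, map ID accuracy
    through the fitted line to estimate OOD accuracy (no OOD labels)."""
    return _norm.cdf(slope * _norm.inv_cdf(acc_id) + bias)

# Synthetic pairwise agreement rates for a small finetuned ensemble
agree_id  = [0.92, 0.90, 0.88, 0.91, 0.89]
agree_ood = [0.78, 0.75, 0.72, 0.76, 0.74]

slope, bias = fit_agreement_line(agree_id, agree_ood)
print(predict_ood_accuracy(0.90, slope, bias))  # estimated OOD accuracy
```

The probit scaling reflects the empirical finding that ID and OOD metrics are linearly correlated after this transform; agreement is computable from unlabeled data, which is what makes the prediction label-free.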
Low Difficulty Summary (written by GrooveSquid.com, original content)
Low Difficulty Summary: This research is about how to safely use big AI models, called "foundation models". These models need to work well even in new situations where there aren't many labeled examples. A recent discovery showed that groups of neural networks can be used to predict how well a model will do in new situations without needing labels. Foundation models are different, though: they get lots of pre-training but only a little extra training afterward, so the models in a group can end up very similar. This study found that the way that little bit of extra training is randomized changes how well the group works, and surprisingly, only one kind of randomness (starting the final layer from different random values) reliably makes the method work across many situations. The study also shows that combining multiple foundation models trained on different data but doing the same task works too. Using these techniques, we can predict how well big AI models will do in new situations with high accuracy.

Keywords

* Artificial intelligence  * Neural network  * Precision