Summary of "What Variables Affect Out-of-Distribution Generalization in Pretrained Models?" by Md Yousuf Harun et al.
What Variables Affect Out-of-Distribution Generalization in Pretrained Models?
by Md Yousuf Harun, Kyungbok Lee, Jhair Gallardo, Giri Krishnan, Christopher Kanan
First submitted to arXiv on: 23 May 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary Pre-trained deep neural networks (DNNs) are widely used for downstream tasks, but their efficacy can vary significantly. This paper investigates which factors influence the transferability and out-of-distribution generalization of pre-trained DNN embeddings, through the lens of the tunnel effect hypothesis. Contrary to earlier work, the experiments show that the tunnel effect is not a universal phenomenon. Instead, training on high-resolution datasets with many classes reduces representation compression and improves transferability. The results emphasize the danger of generalizing findings from toy datasets to broader contexts. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary Pre-trained deep neural networks (DNNs) are used for many tasks, but how well they work can vary a lot. This paper looks at what makes them good or bad at doing other tasks. It turns out that earlier ideas about how DNNs work weren’t always right. Instead, the authors found that training with lots of high-resolution images and many classes helps make DNNs better at doing new things. This matters because it means we can’t assume that findings from small toy datasets will carry over to bigger, more realistic ones. |
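The tunnel effect the paper examines is usually measured by attaching a linear probe (a simple linear classifier) to embeddings taken at different depths of a pre-trained network and comparing probe accuracies: strongly compressed "tunnel" layers transfer poorly. Below is a minimal synthetic sketch of that measurement, not the authors' code; the data, the closed-form ridge probe, and all names here are illustrative assumptions.

```python
# Illustrative sketch of layer-wise linear probing (synthetic data only).
# In the paper's setting, the embeddings would come from layers of a
# pre-trained DNN evaluated on an out-of-distribution dataset.
import numpy as np

rng = np.random.default_rng(0)

def linear_probe_accuracy(embeddings, labels, reg=1e-3):
    """Fit a closed-form ridge classifier on one-hot targets and return
    its accuracy on the same data (a simple proxy for probe quality)."""
    n, d = embeddings.shape
    classes = np.unique(labels)
    onehot = (labels[:, None] == classes[None, :]).astype(float)
    # Ridge regression: W = (X^T X + reg*I)^(-1) X^T Y
    w = np.linalg.solve(embeddings.T @ embeddings + reg * np.eye(d),
                        embeddings.T @ onehot)
    preds = classes[np.argmax(embeddings @ w, axis=1)]
    return float(np.mean(preds == labels))

# Synthetic stand-ins for embeddings taken at two depths of a network:
# an informative early layer vs. a collapsed, label-free "tunnel" layer.
labels = rng.integers(0, 5, size=200)
class_means = rng.normal(size=(5, 32))
early = class_means[labels] + rng.normal(scale=1.0, size=(200, 32))
late = rng.normal(size=(200, 32))  # no label information survives

acc_early = linear_probe_accuracy(early, labels)
acc_late = linear_probe_accuracy(late, labels)
print(f"early-layer probe accuracy: {acc_early:.2f}")
print(f"late-layer probe accuracy:  {acc_late:.2f}")
```

A large accuracy gap between layers is the signature of a tunnel; the paper's finding is that this gap shrinks when the model is pre-trained on high-resolution data with many classes.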
Keywords
» Artificial intelligence » Generalization » Transferability