Summary of "Why Pre-training Is Beneficial For Downstream Classification Tasks?", by Xin Jiang et al.
Why pre-training is beneficial for downstream classification tasks?
by Xin Jiang, Xu Cheng, Zechao Li
First submitted to arXiv on: 11 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
| --- | --- | --- |
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | This research paper proposes a novel game-theoretic approach to explaining the effects of pre-training on downstream tasks in deep neural networks (DNNs). The authors extract and quantify the knowledge encoded by pre-trained models and track how it changes during fine-tuning. Surprisingly, they find that only a small amount of pre-trained knowledge is preserved for downstream inference, yet this preserved knowledge is difficult for models trained from scratch to learn. Fine-tuned models that leverage this exclusive knowledge typically outperform models trained from scratch. The paper also shows how pre-training guides the learning process, leading to faster convergence and better performance. A minimal code sketch contrasting the two training regimes appears after the table. |
| Low | GrooveSquid.com (original content) | Pre-training has been shown to improve accuracy and speed up convergence on downstream tasks. But why does this happen? A new study tries to figure out what is going on. The authors look at the knowledge a pre-trained model learns and how it changes when the model is fine-tuned for a specific task. The results are interesting: only a small part of the pre-trained knowledge is actually useful for the new task, but that part is hard for models trained from scratch to learn. This is why fine-tuned models usually do better than models trained from scratch. The study also shows how pre-training helps guide the learning process, making it faster and more effective. |
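The core comparison described in both summaries, fine-tuning a pre-trained network versus training the same architecture from scratch on a downstream classification task, can be illustrated with a short script. The sketch below is not the paper's game-theoretic analysis; it assumes torchvision's ImageNet-pre-trained ResNet-18 and CIFAR-10 purely as stand-ins for a pre-trained model and a downstream task, with illustrative hyperparameters.

```python
# Minimal sketch (assumed setup, not the paper's method): compare fine-tuning a
# pre-trained ResNet-18 against training the same architecture from scratch.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

def make_model(pretrained: bool, num_classes: int) -> nn.Module:
    # ResNet-18 with or without ImageNet pre-trained weights.
    weights = models.ResNet18_Weights.DEFAULT if pretrained else None
    model = models.resnet18(weights=weights)
    # Replace the classification head for the downstream task.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

def train(model: nn.Module, loader: DataLoader, epochs: int = 5) -> None:
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).train()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    for epoch in range(epochs):
        total, correct = 0, 0
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            logits = model(images)
            loss = loss_fn(logits, labels)
            loss.backward()
            optimizer.step()
            correct += (logits.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)
        print(f"epoch {epoch}: train accuracy = {correct / total:.3f}")

if __name__ == "__main__":
    # CIFAR-10 resized and normalized for an ImageNet-pre-trained backbone.
    transform = transforms.Compose([
        transforms.Resize(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    train_set = datasets.CIFAR10(root="data", train=True, download=True,
                                 transform=transform)
    loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=2)

    # Per the summaries above, the fine-tuned model typically converges faster
    # and reaches higher accuracy than the model trained from scratch.
    train(make_model(pretrained=True, num_classes=10), loader)   # fine-tuning
    train(make_model(pretrained=False, num_classes=10), loader)  # from scratch
```

Only the classification head is replaced in both runs; the remaining weights start either from the pre-trained checkpoint or from random initialization, which is exactly the contrast the summaries describe.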
Keywords
- Artificial intelligence
- Fine tuning
- Inference