
Summary of "Why Pre-training Is Beneficial For Downstream Classification Tasks?", by Xin Jiang et al.


Why pre-training is beneficial for downstream classification tasks?

by Xin Jiang, Xu Cheng, Zechao Li

First submitted to arXiv on: 11 Oct 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)

Abstract of paper · PDF of paper


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper, written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper’s original abstract; read it via the "Abstract of paper" link above.

Medium Difficulty Summary (GrooveSquid.com original content)
This research paper proposes a novel game-theoretic approach to explain the effects of pre-training on downstream classification tasks in deep neural networks (DNNs). The authors extract and quantify the knowledge encoded by a pre-trained model and track how it changes during fine-tuning. Surprisingly, they find that only a small amount of the pre-trained knowledge is preserved for downstream inference, yet this preserved knowledge is difficult for models trained from scratch to learn. Fine-tuned models that leverage this exclusive knowledge typically outperform models trained from scratch. The paper also shows how pre-training guides the learning process, leading to faster convergence and better performance.

Low Difficulty Summary (GrooveSquid.com original content)
Pre-training has been shown to improve accuracy and speed up convergence on downstream tasks. But why does this happen? A new study tries to figure out what’s going on. The authors look at the knowledge a pre-trained model has learned and how that knowledge changes when the model is fine-tuned for a specific task. The results are interesting: only a small part of the pre-trained knowledge is actually useful for the new task, but that part is hard for models trained from scratch to learn. As a result, fine-tuned models usually do better than models trained from scratch. The study also shows how pre-training helps guide the learning process, making it faster and more effective.

Keywords

  • Artificial intelligence
  • Fine-tuning
  • Inference