
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks

by Hao Chen, Jindong Wang, Ankit Shah, Ran Tao, Hongxin Wei, Xing Xie, Masashi Sugiyama, Bhiksha Raj

First submitted to arXiv on: 29 Sep 2023

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper but are written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The paper’s original abstract serves as the high difficulty summary.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Pre-training on large-scale datasets followed by fine-tuning on downstream tasks has become standard practice in deep learning. However, pre-training data often contain label noise that can hurt model generalization. This paper studies the nature of noise in pre-training datasets and how to mitigate its effects on downstream tasks. Specifically, experiments with supervised pre-training models on synthetically noised ImageNet-1K and YFCC15M datasets show that slight noise can benefit in-domain transfer performance but consistently degrades out-of-domain performance, because noise shapes the feature space differently. To mitigate this effect, the authors propose NMTune, a lightweight black-box tuning method that applies an affine transformation to the feature space. Experiments on popular vision and language models demonstrate the importance of Noisy Model Learning.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper looks at how noisy pre-training data can affect how well AI models generalize. The authors tested ways to make these models work better even when the training and testing data are different. They found that making small adjustments to the model’s features helped it do better on both in-domain and out-of-domain tasks.
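The medium difficulty summary describes NMTune as a lightweight black-box method that learns an affine transformation of the frozen pre-trained feature space. As a hedged illustration only (this is not the paper's actual implementation; the toy data, initialization, and training loop below are all assumptions), a minimal numpy sketch of learning an affine map plus a linear head on top of frozen features might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for frozen pre-trained features: 200 samples, 16 dims,
# two classes separated along a random direction (purely illustrative,
# not the ImageNet-1K / YFCC15M setup from the paper).
n, d, k = 200, 16, 2
direction = rng.normal(size=d)
y = rng.integers(0, k, size=n)
feats = rng.normal(size=(n, d)) + np.outer(2.0 * y - 1.0, direction)

# Black-box tuning: the backbone is frozen, so only an affine map of
# its output features (W, b) and a linear classifier (V, c) are learned.
W = np.eye(d) + 0.01 * rng.normal(size=(d, d))  # affine weight, near-identity init (assumption)
b = np.zeros(d)                                 # affine bias
V = 0.01 * rng.normal(size=(d, k))              # classifier weights
c = np.zeros(k)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def loss_and_grads(W, b, V, c):
    z = feats @ W + b                 # affine-transformed frozen features
    logits = z @ V + c
    p = softmax(logits)
    loss = -np.log(p[np.arange(n), y] + 1e-12).mean()
    g = p.copy()                      # gradient of cross-entropy w.r.t. logits
    g[np.arange(n), y] -= 1.0
    g /= n
    gV = z.T @ g
    gc = g.sum(axis=0)
    gz = g @ V.T
    gW = feats.T @ gz
    gb = gz.sum(axis=0)
    return loss, gW, gb, gV, gc

lr = 0.05
initial_loss, *_ = loss_and_grads(W, b, V, c)
for _ in range(300):                  # plain gradient descent on the tuned parameters
    _, gW, gb, gV, gc = loss_and_grads(W, b, V, c)
    W -= lr * gW; b -= lr * gb; V -= lr * gV; c -= lr * gc

final_loss, *_ = loss_and_grads(W, b, V, c)
print(initial_loss, final_loss)
```

The point of the sketch is the access pattern, not the objective: the backbone never receives gradients, so this kind of tuning works even when the pre-trained model is only available as a feature extractor.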

Keywords

  • Artificial intelligence
  • Deep learning
  • Fine tuning
  • Generalization
  • Supervised