
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks

by Hao Chen, Jindong Wang, Ankit Shah, Ran Tao, Hongxin Wei, Xing Xie, Masashi Sugiyama, Bhiksha Raj

First submitted to arXiv on: 29 Sep 2023

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper but are written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The paper’s original abstract serves as the high difficulty summary.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Pre-training on large-scale datasets followed by fine-tuning on downstream tasks has become standard practice in deep learning. However, pre-training data often contain label noise that can hurt model generalization. This paper studies the nature of noise in pre-training datasets and how to mitigate its effects on downstream tasks. Specifically, experiments with supervised pre-training models on synthetically noised ImageNet-1K and YFCC15M datasets show that slight noise can benefit in-domain transfer performance but consistently degrades out-of-domain performance, because noise shapes the feature space differently. To mitigate this effect, the authors propose NMTune, a lightweight black-box tuning method that applies an affine transformation to the feature space. Experiments on popular vision and language models demonstrate the importance of Noisy Model Learning.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper looks at how noisy pre-training data can affect how well AI models generalize. The authors tested ways to make these models work better even when the training and testing data are different. They found that making small adjustments to the model’s features helped it do better on both in-domain and out-of-domain tasks.
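The medium difficulty summary describes NMTune as a lightweight black-box method that learns an affine transformation of the frozen pre-trained feature space. As a hedged illustration only (this is not the paper's actual implementation; the toy data, initialization, and training loop below are all assumptions), a minimal numpy sketch of learning an affine map plus a linear head on top of frozen features might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for frozen pre-trained features: 200 samples, 16 dims,
# two classes separated along a random direction (purely illustrative,
# not the ImageNet-1K / YFCC15M setup from the paper).
n, d, k = 200, 16, 2
direction = rng.normal(size=d)
y = rng.integers(0, k, size=n)
feats = rng.normal(size=(n, d)) + np.outer(2.0 * y - 1.0, direction)

# Black-box tuning: the backbone is frozen, so only an affine map of
# its output features (W, b) and a linear classifier (V, c) are learned.
W = np.eye(d) + 0.01 * rng.normal(size=(d, d))  # affine weight, near-identity init (assumption)
b = np.zeros(d)                                 # affine bias
V = 0.01 * rng.normal(size=(d, k))              # classifier weights
c = np.zeros(k)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def loss_and_grads(W, b, V, c):
    z = feats @ W + b                 # affine-transformed frozen features
    logits = z @ V + c
    p = softmax(logits)
    loss = -np.log(p[np.arange(n), y] + 1e-12).mean()
    g = p.copy()                      # gradient of cross-entropy w.r.t. logits
    g[np.arange(n), y] -= 1.0
    g /= n
    gV = z.T @ g
    gc = g.sum(axis=0)
    gz = g @ V.T
    gW = feats.T @ gz
    gb = gz.sum(axis=0)
    return loss, gW, gb, gV, gc

lr = 0.05
initial_loss, *_ = loss_and_grads(W, b, V, c)
for _ in range(300):                  # plain gradient descent on the tuned parameters
    _, gW, gb, gV, gc = loss_and_grads(W, b, V, c)
    W -= lr * gW; b -= lr * gb; V -= lr * gV; c -= lr * gc

final_loss, *_ = loss_and_grads(W, b, V, c)
print(initial_loss, final_loss)
```

The point of the sketch is the access pattern, not the objective: the backbone never receives gradients, so this kind of tuning works even when the pre-trained model is only available as a feature extractor.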

Keywords

  • Artificial intelligence
  • Deep learning
  • Fine tuning
  • Generalization
  • Supervised