Summary of Label Noise: Ignorance Is Bliss, by Yilun Zhu et al.
Label Noise: Ignorance Is Bliss
by Yilun Zhu, Jianxin Zhang, Aditya Gangrade, Clayton Scott
First submitted to arXiv on 31 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper, written at different levels of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract on arXiv.
Medium | GrooveSquid.com (original content) | This paper introduces a new theoretical framework for learning under multi-class, instance-dependent label noise. The framework views label noise as a form of domain adaptation, specifically posterior drift. It also proposes relative signal strength (RSS), a pointwise measure of the transferability from the noisy posterior to the clean posterior. Using RSS, the authors establish nearly matching upper and lower bounds on excess risk. The findings support the simple Noise-Ignorant Empirical Risk Minimization (NI-ERM) principle, which minimizes empirical risk as if the labels were clean. To translate this insight into practice, the authors use NI-ERM to fit a linear classifier on top of a frozen self-supervised feature extractor, achieving state-of-the-art performance on the CIFAR-N data challenge.
Low | GrooveSquid.com (original content) | This paper helps us understand how computers can learn from noisy information, that is, when some of the training labels are wrong. The researchers created a new way to look at this problem by comparing it to a related setting called domain adaptation. They also came up with a way to measure how much the noisy labels still reveal about the true labels. Their results show that simply ignoring the noise and training as usual can work better than trying to fix the noisy labels first. They used this approach to train a model to recognize objects in pictures, beating previous records.
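The NI-ERM idea described above can be illustrated with a small synthetic sketch: train an ordinary linear classifier on noisy labels with no noise correction at all, then evaluate it on the clean labels. This is only a hypothetical toy example, not the paper's experiment; the Gaussian features below stand in for the output of a frozen self-supervised feature extractor, and the 30% symmetric flip rate is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two Gaussian classes in a 2-D feature space (a stand-in for
# features produced by a frozen self-supervised extractor).
n = 2000
X = np.vstack([rng.normal(-1.0, 1.0, (n // 2, 2)),
               rng.normal(+1.0, 1.0, (n // 2, 2))])
y_clean = np.array([0] * (n // 2) + [1] * (n // 2))

# Inject symmetric label noise: flip 30% of the labels at random.
flip = rng.random(n) < 0.30
y_noisy = np.where(flip, 1 - y_clean, y_clean)

# Noise-ignorant ERM: plain logistic regression trained on the
# NOISY labels, with no attempt to model or correct the noise.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
    g = p - y_noisy                          # gradient of logistic loss
    w -= 0.1 * (X.T @ g) / n
    b -= 0.1 * g.mean()

# Evaluate on the CLEAN labels: under symmetric noise the noisy and
# clean posteriors share the same decision boundary, so ignoring the
# noise still recovers a good classifier.
acc = ((X @ w + b > 0).astype(int) == y_clean).mean()
print(f"clean-label accuracy: {acc:.2f}")
```

Symmetric noise shrinks the posterior toward 1/2 but leaves the optimal decision boundary unchanged, which is the intuition behind why ignoring the noise can be harmless here; the paper's RSS quantity makes this precise for general instance-dependent noise.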
Keywords
» Artificial intelligence » Domain adaptation » Self-supervised » Transferability