Summary of Combating Semantic Contamination in Learning with Label Noise, by Wenxiao Fan et al.
Combating Semantic Contamination in Learning with Label Noise
by Wenxiao Fan, Kan Li
First submitted to arXiv on: 16 Dec 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper at a different level of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary: Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary: This paper tackles a critical issue in deep learning: noisy labels can significantly degrade model performance. The authors identify a problem they call “Semantic Contamination”, which occurs when label refurbishment methods, which reconstruct noisy labels, introduce misleading semantic associations between classes. They analyze a representative label refurbishment method, Robust LR, and find that it is prone to Semantic Contamination. To address this, they propose Collaborative Cross Learning, which applies semi-supervised learning to the refurbished labels to extract correct semantic relationships from embeddings across views and models. Experimental results show that the method outperforms existing approaches on both synthetic and real-world datasets. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary: Label noise is a big problem in deep learning. When the labels used to train a model are noisy, the model can perform poorly. The authors of this paper found that some methods for fixing noisy labels can actually make things worse by introducing “Semantic Contamination”. This happens when a method tries to fix a label but ends up linking classes together in a misleading way. The authors looked at one representative method, Robust LR, and found that it has this problem. To solve it, they came up with a new approach, Collaborative Cross Learning, which uses semi-supervised learning to get the correct relationships between classes from the data itself. The new method did better than existing approaches on both synthetic (artificially generated) and real-world datasets. |
Keywords
» Artificial intelligence » Deep learning » Semi-supervised