
Summary of Combating Semantic Contamination in Learning with Label Noise, by Wenxiao Fan et al.


Combating Semantic Contamination in Learning with Label Noise

by Wenxiao Fan, Kan Li

First submitted to arXiv on: 16 Dec 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by: paper authors)

Read the original abstract here

Medium Difficulty Summary (written by: GrooveSquid.com, original content)

In this paper, researchers tackle a critical issue in deep learning: noisy labels can significantly degrade model performance. They identify a common problem they call “Semantic Contamination”, which occurs when methods that reconstruct noisy labels introduce unwanted associations between classes. The authors analyze a representative label refurbishment method, Robust LR, and find that it is prone to Semantic Contamination. To address this issue, they propose a new approach called Collaborative Cross Learning, which uses semi-supervised learning on refurbished labels to extract correct semantic relationships from embeddings across views and models. Experimental results show that their method outperforms existing approaches on both synthetic and real-world datasets.

Low Difficulty Summary (written by: GrooveSquid.com, original content)

Label noise is a big problem in deep learning. When the labels used to train a model are noisy, the model can perform poorly. The authors of this paper found that some methods for fixing noisy labels can actually make things worse by introducing “Semantic Contamination”. This happens when a method tries to fix a label but ends up linking classes together in a way that is not helpful. The authors looked at one popular method called Robust LR and found that it has this problem. To solve this issue, they came up with a new approach that uses semi-supervised learning to get the correct relationships between classes from the data itself. This new method did better than existing approaches on both synthetic and real-world datasets.
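
To make the idea of fixing a noisy label more concrete, here is a minimal sketch of one generic way to do it: blend the given label with the model's own prediction. This is not Robust LR and not Collaborative Cross Learning; the function names `one_hot` and `refurbish` and the trust weight `alpha` are assumptions made up for this illustration.

```python
# Generic label refurbishment sketch (an illustration only, NOT the paper's
# Robust LR or Collaborative Cross Learning): blend a possibly noisy one-hot
# label with the model's predicted class distribution.


def one_hot(label, num_classes):
    """Return a one-hot list for an integer class label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def refurbish(noisy_label, predicted_probs, alpha=0.5):
    """Convex combination of the given label and the model prediction.

    alpha is a hypothetical trust weight: 1.0 keeps the given label
    unchanged, 0.0 replaces it entirely with the prediction.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    target = one_hot(noisy_label, len(predicted_probs))
    return [alpha * t + (1.0 - alpha) * p
            for t, p in zip(target, predicted_probs)]


# Example: the label says class 0, but the model favours class 2 and also
# puts some probability on class 1.
probs = [0.1, 0.3, 0.6]
soft = refurbish(0, probs, alpha=0.5)
print(soft)
```

In this toy example the refurbished target keeps some of the probability the model assigned to class 1. If that association between classes is spurious, it is carried into the new training target, which is one way to picture how unwanted links between classes might creep in when labels are reconstructed from predictions.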

Keywords

» Artificial intelligence  » Deep learning  » Semi-supervised