Summary of Tackling Noisy Labels with Network Parameter Additive Decomposition, by Jingyi Wang et al.
Tackling Noisy Labels with Network Parameter Additive Decomposition
by Jingyi Wang, Xiaobo Xia, Long Lan, Xinghao Wu, Jun Yu, Wenjing Yang, Bo Han, Tongliang Liu
First submitted to arXiv on: 20 Mar 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary This paper proposes a simple yet effective approach to the overfitting caused by noisy labels in deep networks. It exploits the memorization effect: deep networks tend to memorize clean data before mislabeled data. Early stopping is a common remedy, but because it cannot distinguish the two types of data, overfitting to mislabeled data continues. To decouple this memorization, the method additively decomposes each network parameter into a part (σ) that memorizes clean data and a part (γ) that memorizes mislabeled data. Updates to σ are encouraged early in training and discouraged later to reduce interference from mislabeled data, while updates to γ follow the opposite schedule. Extensive experiments on simulated and real-world benchmarks confirm the superior performance of this approach. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary A new method helps deep networks learn better when some of their training labels are wrong. Noisy labels can make a network think it has learned something when it has really just memorized mistakes. A simple trick called early stopping tries to fix this, but it doesn't work well because it cannot tell clean examples from mislabeled ones. This paper proposes a way to separate what the network learns from clean data versus mislabeled data. It does this by breaking the network's parameters into two groups: one that learns from clean data and another that absorbs the mislabeled data. This lets the network focus on learning the right things and avoid memorizing mistakes. The results show that this approach works well in practice. |
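The core idea described above can be sketched in a few lines. The following is a minimal illustrative toy, not the authors' actual algorithm: the function name, the linear decay schedule for σ's updates, and the toy regression data are all assumptions chosen for clarity. Each weight is split additively as w = σ + γ, σ is updated mostly early in training, and γ is updated mostly late.

```python
import numpy as np

def train_additive_decomposition(X, y, epochs=200, lr=0.1):
    # Illustrative sketch (not the paper's exact method): each weight is
    # w = sigma + gamma. sigma is meant to capture the clean-data signal,
    # gamma to absorb the fit to mislabeled data.
    d = X.shape[1]
    sigma = np.zeros(d)
    gamma = np.zeros(d)
    for t in range(epochs):
        w = sigma + gamma
        grad = X.T @ (X @ w - y) / len(y)   # mean-squared-error gradient
        alpha = 1.0 - t / epochs            # decays from 1 toward 0
        sigma -= lr * alpha * grad          # sigma updated mostly early
        gamma -= lr * (1.0 - alpha) * grad  # gamma updated mostly late
    return sigma, gamma

# Toy 1-D regression with y = 2x, where every 10th label is flipped (noise).
X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y = 2.0 * X[:, 0]
y[::10] *= -1.0
sigma, gamma = train_additive_decomposition(X, y)
# At test time only sigma would be kept, discarding gamma's noisy fit.
print("sigma:", sigma, "gamma:", gamma)
```

Because σ receives large updates while the schedule weight is still near 1, it ends up holding most of the learned slope, while γ accumulates the later adjustments; discarding γ at test time is what reduces the influence of the mislabeled examples.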
Keywords
* Artificial intelligence * Early stopping * Overfitting