Summary of Towards the Mitigation Of Confirmation Bias in Semi-supervised Learning: a Debiased Training Perspective, by Yu Wang et al.
Towards the Mitigation of Confirmation Bias in Semi-supervised Learning: a Debiased Training Perspective
by Yu Wang, Yuxuan Yin, Peng Li
First submitted to arXiv on: 26 Sep 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | Semi-supervised learning (SSL) commonly exhibits confirmation bias, where models disproportionately favor certain classes, leading to errors in predicted pseudo labels that accumulate under a self-training paradigm. Unlike supervised settings, which benefit from a rich, static data distribution, SSL inherently lacks mechanisms to correct this self-reinforced bias, necessitating debiased interventions at each training step. Although the generation of debiased pseudo labels has been extensively studied, their effective utilization remains underexplored. To address these challenges, we introduce TaMatch, a unified framework for debiased training in SSL. TaMatch employs a scaling ratio derived from both a prior target distribution and the model’s learning status to estimate and correct bias at each training step. This ratio adjusts the raw predictions on unlabeled data to produce debiased pseudo labels. In the utilization phase, these labels are weighted differently according to their predicted class, enhancing training equity and minimizing class bias. Empirical evaluations show that TaMatch significantly outperforms existing state-of-the-art methods across a range of challenging image classification tasks. |
| Low | GrooveSquid.com (original content) | Semi-supervised learning (SSL) can be biased, making mistakes in predicting pseudo labels when trained on itself. Unlike regular supervised learning, SSL lacks ways to fix this bias. The paper introduces TaMatch, a method that helps correct bias by adjusting predictions based on the model’s progress and the target distribution. This makes training more fair and accurate. |
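To make the two-phase idea concrete, here is a minimal NumPy sketch of what a scaling-ratio debiasing step plus class-aware weighting could look like. The exact formulas are not given in the summary, so the specific ratio (prior distribution divided by the model's running predicted distribution), the confidence threshold `tau`, and the inverse-frequency weights are all illustrative assumptions, not TaMatch's actual equations.

```python
import numpy as np

def debiased_pseudo_labels(probs, target_dist, model_dist, tau=0.8):
    """Illustrative debiasing step in the spirit of TaMatch (not the paper's exact method).

    probs:       (N, C) raw softmax predictions on unlabeled data
    target_dist: (C,)   prior target class distribution
    model_dist:  (C,)   running estimate of the model's predicted class
                        distribution (a proxy for its "learning status")
    """
    # Scaling ratio: boost classes the model under-predicts relative to
    # the prior, suppress over-predicted ones. (Assumed form.)
    ratio = target_dist / np.maximum(model_dist, 1e-8)
    adjusted = probs * ratio                          # rescale raw predictions
    adjusted /= adjusted.sum(axis=1, keepdims=True)   # renormalize to probabilities

    conf = adjusted.max(axis=1)       # confidence after debiasing
    labels = adjusted.argmax(axis=1)  # debiased pseudo labels
    mask = conf >= tau                # keep only confident samples

    # Utilization phase: weight each retained label inversely to how often
    # its class appears, so rare classes carry more influence per sample.
    class_freq = np.bincount(labels[mask], minlength=probs.shape[1]) + 1
    weights = (1.0 / class_freq)[labels] * mask
    return labels, weights
```

In a full SSL loop, `weights` would multiply each sample's unsupervised loss term, so over-represented classes cannot dominate training the way they do under plain confidence thresholding.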
Keywords
» Artificial intelligence » Image classification » Self training » Semi supervised » Supervised