Summary of Do Not Trust What You Trust: Miscalibration in Semi-supervised Learning, by Shambhavi Mishra et al.
Do not trust what you trust: Miscalibration in Semi-supervised Learning
by Shambhavi Mishra, Balamurali Murugesan, Ismail Ben Ayed, Marco Pedersoli, Jose Dolz
First submitted to arXiv on: 22 Mar 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper's original abstract, available on the arXiv page |
Medium | GrooveSquid.com (original content) | The paper proposes a method to improve semi-supervised learning (SSL) by addressing overconfident predictions. State-of-the-art SSL approaches use highly confident predictions as pseudo-labels to guide training on unlabeled samples. However, these confidence estimates are often unreliable, which leads to incorrect pseudo-labels. The authors demonstrate that pseudo-label-based SSL methods are significantly miscalibrated, and formally explain this miscalibration as a consequence of minimizing the min-entropy, a lower bound of the Shannon entropy. To alleviate the issue, they integrate a simple penalty term into the training objective that prevents the network's predictions on unlabeled samples from becoming overconfident (a code sketch of such a penalty follows this table). The proposed solution improves both the calibration and the discriminative power of SSL models on various image classification benchmarks. |
Low | GrooveSquid.com (original content) | The paper tackles a problem in semi-supervised learning (SSL) where a model's predictions become too confident without being accurate. Today's best SSL methods use confident predictions as fake labels to help train on unlabeled samples, but this can produce wrong fake labels. The authors show that SSL methods built on fake labels are very bad at judging how certain they should be, and they explain why this happens. They then fix the problem by adding a simple penalty to the model that stops it from becoming too confident, making its predictions both better calibrated and more accurate. |
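For context, the min-entropy of a predictive distribution p is -log max_i p_i, which lower-bounds the Shannon entropy -sum_i p_i log p_i; pushing the winning class probability toward 1 drives both toward zero, which is exactly the overconfident regime the paper describes. The paper's penalty is only described at a high level above, so the sketch below is a minimal illustration rather than the authors' exact formulation: it adds a maximum-entropy-style penalty to a FixMatch-style pseudo-labeling objective. The FixMatch framing, the 0.95 confidence threshold, the weight lambda_pen, and the entropy form of the penalty are all assumptions made for this example.

```python
import torch
import torch.nn.functional as F

def ssl_loss_with_confidence_penalty(logits_labeled, targets,
                                     logits_weak, logits_strong,
                                     threshold=0.95, lambda_pen=0.1):
    """FixMatch-style SSL loss plus an illustrative penalty that
    discourages overconfident predictions on unlabeled samples.
    (Hypothetical sketch; not the paper's exact objective.)"""
    # Supervised cross-entropy on the labeled batch.
    sup_loss = F.cross_entropy(logits_labeled, targets)

    # Pseudo-labels from the weakly augmented view; keep only confident ones.
    with torch.no_grad():
        probs_weak = F.softmax(logits_weak, dim=1)
        max_probs, pseudo_labels = probs_weak.max(dim=1)
        mask = (max_probs >= threshold).float()

    # Unsupervised cross-entropy on the strongly augmented view,
    # masked to the confidently pseudo-labeled samples.
    per_sample = F.cross_entropy(logits_strong, pseudo_labels, reduction="none")
    unsup_loss = (per_sample * mask).mean()

    # Penalty: negative Shannon entropy of the unlabeled predictions.
    # Adding it to the loss penalizes low-entropy (near one-hot, overconfident)
    # outputs, counteracting the implicit min-entropy minimization.
    probs_strong = F.softmax(logits_strong, dim=1)
    entropy = -(probs_strong * probs_strong.clamp_min(1e-8).log()).sum(dim=1)
    penalty = -entropy.mean()

    return sup_loss + unsup_loss + lambda_pen * penalty
```

In a setup like this, lambda_pen would trade off discriminative accuracy against calibration: too large a weight flattens the predictions, while too small a weight leaves the overconfidence untouched.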
Keywords
* Artificial intelligence
* Image classification
* Semi-supervised