Summary of Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It, by Guoxuan Xia et al.
Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It
by Guoxuan Xia, Olivier Laurent, Gianni Franchi, Christos-Savvas Bouganis
First submitted to arXiv on 19 Mar 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The paper investigates the impact of label smoothing (LS) on selective classification (SC), the task of using a model's uncertainty to reject its likely misclassifications. It finds that LS consistently degrades SC performance across a range of large-scale tasks and architectures, even though it is an effective regularisation method for improving test accuracy. By analysing logit-level gradients, the authors show that LS suppresses the maximum logit more strongly when a prediction is likely to be correct, eroding the uncertainty rank ordering between correct and incorrect predictions and explaining why stronger classifiers can underperform at SC. To recover the lost SC performance, the paper proposes post-hoc logit normalisation, which is shown to be empirically effective. |
| Low | GrooveSquid.com (original content) | Label smoothing can actually make it harder for a model to flag its own mistakes. This happens because LS changes how the model ranks the confidence of its predictions, so correct and incorrect predictions become harder to tell apart. To fix this, the researchers developed a simple technique that normalises the model's logit values after training, helping the model better distinguish correct from incorrect predictions. |
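To make the proposed fix concrete, here is a minimal sketch of post-hoc logit normalisation for selective classification. It assumes the common formulation of dividing each logit vector by its p-norm before the softmax, with `p` a tunable hyperparameter and the maximum softmax probability used as the confidence score; the function names and the threshold value are illustrative, not taken from the paper.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def normalised_confidence(logits, p=2):
    # Post-hoc logit normalisation (sketch): divide each logit vector by its
    # p-norm before applying softmax. `p=2` is an assumed default here.
    norms = np.linalg.norm(logits, ord=p, axis=-1, keepdims=True)
    probs = softmax(logits / norms)
    # Max softmax probability serves as the uncertainty/confidence score.
    return probs.max(axis=-1)

# Selective classification: accept a prediction only if its confidence
# exceeds a threshold (0.5 here is arbitrary for illustration).
logits = np.array([[4.0, 1.0, 0.5],   # peaked logits -> higher confidence
                   [1.1, 1.0, 0.9]])  # flat logits   -> lower confidence
conf = normalised_confidence(logits)
accept = conf > 0.5
```

Because the normalisation is applied only at inference time, it requires no retraining and can be added to any label-smoothed classifier as a drop-in change to the confidence score.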
Keywords
* Artificial intelligence * Classification