
Summary of “Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It”, by Guoxuan Xia et al.


Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It

by Guoxuan Xia, Olivier Laurent, Gianni Franchi, Christos-Savvas Bouganis

First submitted to arXiv on: 19 Mar 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (original content by GrooveSquid.com)
The paper investigates the impact of label smoothing (LS) on selective classification (SC), a task where models aim to reject misclassifications using their uncertainty. It finds that LS consistently degrades SC performance across various large-scale tasks and architectures, despite being an effective regularisation method for improving test accuracy. The study analyses logit-level gradients and shows that LS suppresses the maximum logit value more when a prediction is likely to be correct, leading to a loss of uncertainty rank ordering between correct and incorrect predictions. This explains why strong classifiers underperform in SC tasks. To recover the lost SC performance, the paper proposes post-hoc logit normalisation, which is empirically effective.

Low Difficulty Summary (original content by GrooveSquid.com)
Label smoothing can actually make it harder for models to correctly identify misclassifications. This happens because LS changes how the model ranks its predictions, making it more likely to incorrectly reject correct predictions. To fix this issue, the researchers developed a new technique that normalises the model’s logit values after training. This helps the model better distinguish between correct and incorrect predictions.

Keywords

  • Artificial intelligence
  • Classification