


Decoupling Feature Extraction and Classification Layers for Calibrated Neural Networks

by Mikkel Jordahn, Pablo M. Olmos

First submitted to arXiv on: 2 May 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (the paper’s original abstract, written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (original content written by GrooveSquid.com)
Deep learning models have shown great promise in many classification applications, but they can be poorly calibrated when over-parametrized, meaning their predicted confidences do not match how often they are actually correct. This matters in safety-critical areas such as healthcare, where reliable confidence estimates are as important as accuracy. One way to improve calibration without sacrificing accuracy is to decouple the training of the feature extraction layers from the classification layers in over-parametrized architectures such as Wide Residual Networks (WRN) and Vision Transformers (ViT). This two-stage approach can significantly improve model calibration while retaining accuracy at a low additional training cost. Calibration can be improved further by placing a Gaussian prior on the last hidden layer outputs of the network and training the classification stage variationally. These methods have been shown to improve calibration across ViT and WRN architectures on several image classification benchmark datasets.
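
To make the two-stage idea concrete, here is a minimal PyTorch-style sketch, not the authors’ code. It assumes a hypothetical `backbone` that maps images to feature vectors and a `VariationalHead` whose KL penalty toward a standard-normal prior stands in for the paper’s variational classification stage; the class names, hyperparameters, and exact objective are illustrative assumptions rather than the method as published.

```python
# Sketch of decoupled two-stage training: (1) train backbone + head jointly,
# (2) freeze the feature extractor and retrain only the classification head,
# here with a simplified Gaussian-prior / KL regularizer on the last hidden features.
import torch
import torch.nn as nn
import torch.nn.functional as F


class VariationalHead(nn.Module):
    """Classification head that samples the last hidden representation."""

    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.mu = nn.Linear(feat_dim, feat_dim)       # mean of q(z | features)
        self.logvar = nn.Linear(feat_dim, feat_dim)   # log-variance of q(z | features)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, feats):
        mu, logvar = self.mu(feats), self.logvar(feats)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        logits = self.classifier(z)
        # KL( N(mu, sigma^2) || N(0, I) ), averaged over the batch
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).sum(dim=1).mean()
        return logits, kl


def train_decoupled(backbone, head, stage1_loader, stage2_loader, device, beta=1e-3):
    """Stage 1: joint training. Stage 2: frozen features, head-only variational training."""
    backbone.to(device)
    head.to(device)

    # ---- Stage 1: standard joint training with cross-entropy only ----
    opt = torch.optim.SGD(list(backbone.parameters()) + list(head.parameters()),
                          lr=0.1, momentum=0.9)
    backbone.train()
    head.train()
    for x, y in stage1_loader:
        x, y = x.to(device), y.to(device)
        logits, _ = head(backbone(x))
        loss = F.cross_entropy(logits, y)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # ---- Stage 2: freeze the feature extractor, retrain only the head ----
    for p in backbone.parameters():
        p.requires_grad_(False)
    backbone.eval()
    opt = torch.optim.SGD(head.parameters(), lr=0.01, momentum=0.9)
    head.train()
    for x, y in stage2_loader:
        x, y = x.to(device), y.to(device)
        with torch.no_grad():
            feats = backbone(x)                          # features are fixed in this stage
        logits, kl = head(feats)
        loss = F.cross_entropy(logits, y) + beta * kl    # ELBO-style objective
        opt.zero_grad()
        loss.backward()
        opt.step()

    return backbone, head
```

Each stage is shown as a single pass over its loader for brevity; in practice both stages would run for multiple epochs, and the second stage is cheap because only the head’s parameters are updated.
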
Low Difficulty Summary (original content written by GrooveSquid.com)
Deep learning models are really good at classifying things, but sometimes they make mistakes because they’re not very good at saying how sure they are about their answers. This is important in healthcare because we need to be able to trust the models that help us diagnose patients. Researchers found a way to make these models better by separating the parts of the model that learn features and make predictions. They also tried putting special limits on the model’s outputs to keep it from getting too confident or too uncertain. These tricks worked really well for several different image classification tasks.

Keywords

» Artificial intelligence  » Classification  » Deep learning  » Feature extraction  » Image classification  » ViT