Summary of Deep Neural Network Benchmarks For Selective Classification, by Andrea Pugnana and Lorenzo Perini and Jesse Davis and Salvatore Ruggieri


Deep Neural Network Benchmarks for Selective Classification

by Andrea Pugnana, Lorenzo Perini, Jesse Davis, Salvatore Ruggieri

First submitted to arXiv on: 23 Jan 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (stat.ML)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty | Written by | Summary
High | Paper authors | High Difficulty Summary
Read the original abstract here
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary
In this paper, researchers study selective classification, a way to make machine learning predictions more trustworthy in socially sensitive tasks: a selection mechanism lets the model abstain from predicting when it is uncertain. The framework trades off the proportion of rejected predictions against the improvement in predictive performance on the predictions that are kept. Many approaches have been proposed, but earlier evaluations were limited to partial comparisons among methods and settings, so there is little insight into their relative merits. To address this gap, the authors benchmark 18 baselines on a diverse set of 44 datasets that spans image and tabular data as well as binary and multiclass tasks. The results show that no single approach dominates, and the best method depends on the user’s objectives.
Low | GrooveSquid.com (original content) | Low Difficulty Summary
Machine learning models are increasingly used for socially sensitive tasks, which means they need to be trustworthy. One way to achieve this is to let the model say “I’m not sure” when it’s really uncertain. This helps prevent mistakes that could have bad consequences. To make this work, you need a mechanism that decides when to give an answer and when to stay quiet. Lots of different approaches have been tried, but earlier studies compared only some of them, and only in some settings. In this paper, researchers fill that gap by testing 18 different methods on 44 different datasets. They look at things like how often the model makes mistakes, what kinds of mistakes it makes, and how well it does when faced with new information. The results show that there’s not one “best” method; it depends on what you want to use the model for.
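
The trade-off described in the summaries, between how many predictions a model rejects and how accurate it is on the ones it keeps, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the confidence-threshold rule, the function names (select, coverage, selective_error), and the toy data are all made up for this example and are not taken from the paper or from any of its 18 baselines.

```python
# Illustrative sketch only: a confidence-threshold selection rule, which is
# one simple way for a model to abstain. Names and data are assumptions.

def select(confidences, threshold):
    """Return True where the model answers, False where it abstains."""
    return [c >= threshold for c in confidences]

def coverage(accepted):
    """Fraction of inputs for which the model gives a prediction."""
    return sum(accepted) / len(accepted)

def selective_error(predictions, labels, accepted):
    """Error rate measured only on the accepted predictions."""
    kept = [(p, y) for p, y, a in zip(predictions, labels, accepted) if a]
    if not kept:
        return 0.0  # convention chosen here: nothing accepted, no errors
    return sum(p != y for p, y in kept) / len(kept)

# Toy data (made up): predicted class, true class, model confidence.
predictions = [1, 0, 1, 1, 0, 1]
labels      = [1, 0, 0, 1, 1, 1]
confidences = [0.95, 0.90, 0.55, 0.80, 0.60, 0.99]

for threshold in (0.0, 0.7):
    accepted = select(confidences, threshold)
    print(threshold, coverage(accepted),
          selective_error(predictions, labels, accepted))
```

On this toy data, raising the threshold from 0.0 to 0.7 lowers coverage from 1.0 to about 0.67 and lowers the selective error from about 0.33 to 0.0, because the two wrong predictions happen to be the two least confident ones. Real models are rarely this well behaved, which is one reason a benchmark across many methods and datasets is useful.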

Keywords

  • Artificial intelligence
  • Classification
  • Machine learning