The Implicit Bias of Gradient Descent on Separable Multiclass Data

by Hrithik Ravi, Clayton Scott, Daniel Soudry, Yutong Wang

First submitted to arXiv on: 2 Nov 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Machine Learning (stat.ML)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper studies implicit bias in machine learning: the tendency of optimization-based training algorithms such as gradient descent to prefer simple estimators over more complex ones, even without explicit regularization. The authors extend the existing implicit-bias theory from binary classification to multiclass classification using the framework of Permutation Equivariant and Relative Margin-based (PERM) losses, which includes cross-entropy among other losses and so supports a broader analysis of implicit bias in multiclass settings. The proof techniques closely mirror those of the binary case, demonstrating the PERM framework's ability to bridge the gap between binary and multiclass classification.
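To make the phenomenon concrete, here is a minimal numerical sketch (not the authors' code; the synthetic Gaussian data, step size, and iteration count are all illustrative assumptions). It trains a linear multiclass model with cross-entropy loss by plain gradient descent on linearly separable data: the weight norm grows without bound while the normalized weight direction stabilizes, which is the directional convergence that the implicit-bias literature characterizes.

```python
# Minimal sketch of implicit bias: gradient descent on cross-entropy
# over separable multiclass data. Norm of W diverges, direction of W
# stabilizes. All data and hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Three well-separated Gaussian clusters -> linearly separable 3-class data.
centers = np.array([[5.0, 0.0], [-5.0, 5.0], [-5.0, -5.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])
y = np.repeat(np.arange(3), 30)

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)  # numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

W = np.zeros((3, 2))   # one weight vector per class, no bias
eta = 0.5              # step size (illustrative)
prev_dir = None
for t in range(1, 20001):
    P = softmax(X @ W.T)               # (n, 3) class probabilities
    P[np.arange(len(y)), y] -= 1.0     # softmax - one-hot = CE gradient term
    W -= eta * (P.T @ X) / len(y)      # gradient descent step
    if t % 5000 == 0:
        direction = W / np.linalg.norm(W)
        drift = (np.linalg.norm(direction - prev_dir)
                 if prev_dir is not None else float("nan"))
        print(f"t={t:6d}  ||W||={np.linalg.norm(W):8.3f}  "
              f"direction drift={drift:.2e}")
        prev_dir = direction
```

In the binary case, Soudry et al. showed that this limiting direction is the hard-margin SVM solution; the paper's contribution is an analogous directional characterization for multiclass classification under PERM losses.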
Low Difficulty Summary (written by GrooveSquid.com, original content)
The paper looks at how machine learning models can settle on simpler answers even when more complex ones would also fit the data. This is called implicit bias. So far, this topic has mostly been studied for two-choice questions, with only a few studies covering multi-choice questions (where you have three or more options). The authors fill this gap using a new way of framing the problem that works for both two-choice and multi-choice questions.

Keywords

» Artificial intelligence  » Classification  » Cross entropy  » Machine learning  » Optimization