

Kernel vs. Kernel: Exploring How the Data Structure Affects Neural Collapse

by Vignesh Kothapalli, Tom Tirer

First submitted to arXiv on: 4 Jun 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Information Theory (cs.IT); Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors): the paper's original abstract.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper delves into the Neural Collapse (NC) phenomenon in neural network (NN) classifiers. Recent research has shown that NC emerges when NNs are trained beyond the zero training error point, and is characterized by a decrease in the within-class variability of the deepest features, a property known as NC1. Theoretical works on NC typically employ simplified unconstrained feature models (UFMs), which mask any effect of the data on the extent of collapse. This study provides a kernel-based analysis that overcomes this limitation. The authors first establish expressions for the traces of the within- and between-class covariance matrices of the features in terms of a given kernel function. They then focus on kernels associated with shallow NNs: the Neural Tangent Kernel (NTK), associated with gradient-based training of wide networks, and the Neural Network Gaussian Process kernel (NNGP), associated with the network at initialization. Surprisingly, they find that the NTK does not yield more collapsed features than the NNGP for prototypical data models. Since NC emerges through training, they therefore propose an adaptive kernel that generalizes the NNGP to model the feature mapping learned from the training data. Contrasting the NC1 analyses for these two kernels yields insights into the effect of the data distribution on the extent of collapse, which align with the behavior observed in practical NN training. These findings deepen the understanding of Neural Collapse and its dependence on the data distribution, and can inform strategies for optimizing NN performance.
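The within- and between-class variability at the heart of NC1 can be made concrete with a small sketch. One commonly used NC1 proxy in the literature is the ratio of the traces of the within- and between-class covariance matrices of the deepest features; a smaller value means stronger collapse. The function name and the exact normalization below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def nc1_metric(features, labels):
    """Ratio of within-class to between-class covariance traces.

    A smaller value indicates stronger collapse (NC1): features of
    the same class concentrate tightly around their class mean.
    """
    classes = np.unique(labels)
    n = len(features)
    global_mean = features.mean(axis=0)
    tr_within, tr_between = 0.0, 0.0
    for c in classes:
        f_c = features[labels == c]
        mu_c = f_c.mean(axis=0)
        # within-class: spread of samples around their class mean
        tr_within += ((f_c - mu_c) ** 2).sum() / n
        # between-class: spread of class means around the global mean
        tr_between += len(f_c) * ((mu_c - global_mean) ** 2).sum() / n
    return tr_within / tr_between
```

When every sample of a class equals its class mean, the metric is exactly zero (perfect collapse); any within-class spread pushes it above zero.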
Low Difficulty Summary (written by GrooveSquid.com, original content)
Neural networks are very good at recognizing patterns in pictures or sounds. As training goes on, something curious happens inside them: the network's internal descriptions of examples from the same class become almost identical to one another. This is called Neural Collapse (NC). Researchers want to understand why this happens and how the training data affects it. Most theoretical studies use simplified models that ignore the data entirely. This paper takes a different approach, using kernel-based analysis to study NC. The authors compare two kernels: one that describes the network at initialization (NNGP) and another that describes how the network changes during training (NTK). Interestingly, they find that the training-related kernel does not produce more collapsed features than the one at initialization. So the authors build an adaptive kernel that accounts for what the network actually learns from the data, and use it to see how the data distribution shapes the collapse. This study can help researchers understand why Neural Collapse happens and how the structure of the data influences a network's behavior.
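For readers curious what the NNGP and NTK of a shallow ReLU network actually look like, both have well-known closed forms (the arc-cosine expressions). The sketch below uses one common convention (unit weight variance, no biases, one hidden layer with both layers trained); it is an illustration of these standard kernels, not the exact setup analyzed in the paper:

```python
import numpy as np

def relu_kernels(x, y, eps=1e-12):
    """NNGP and NTK of a single-hidden-layer ReLU network at infinite width.

    Standard arc-cosine closed forms, weights ~ N(0, 1), no biases.
    Returns (nngp, ntk) for input vectors x and y.
    """
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    cos_t = np.clip(x @ y / (nx * ny + eps), -1.0, 1.0)
    theta = np.arccos(cos_t)
    # NNGP: covariance of network outputs at initialization
    nngp = (nx * ny / (2 * np.pi)) * (np.sin(theta) + (np.pi - theta) * cos_t)
    # derivative kernel of the ReLU, E[relu'(u) relu'(v)]
    dot_sigma = (np.pi - theta) / (2 * np.pi)
    # NTK adds the contribution of training the first-layer weights
    ntk = nngp + (x @ y) * dot_sigma
    return nngp, ntk
```

As a sanity check, for x = y the angle is zero, so the NNGP equals half the squared norm of the input and the NTK equals the full squared norm.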

Keywords

» Artificial intelligence  » Neural network