
Summary of "Opening the Black Box: Predicting the Trainability of Deep Neural Networks with Reconstruction Entropy," by Yanick Thurn et al.


Opening the Black Box: predicting the trainability of deep neural networks with reconstruction entropy

by Yanick Thurn, Ro Jefferson, Johanna Erdmenger

First submitted to arXiv on: 13 Jun 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Disordered Systems and Neural Networks (cond-mat.dis-nn); High Energy Physics – Theory (hep-th); Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The paper's original abstract serves as the high difficulty summary.

Medium Difficulty Summary (GrooveSquid.com original content)
The proposed method predicts the trainable regime in parameter space for deep feedforward neural networks (DNNs) by reconstructing the inputs from subsequent activation layers using a cascade of single-layer auxiliary networks. Computing the relative entropy between the reconstructed images and the original inputs probes the information loss through the network, a quantity that is sensitive to the network's phase behavior. This allows trainability to be predicted after just one epoch of training, reducing overall training time on datasets including MNIST, CIFAR10, FashionMNIST, and white noise. The method generalizes to residual neural networks (ResNets) and convolutional neural networks (CNNs), and, by displaying the changes made to the input data at each layer, it illustrates a network's decision-making process.
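To make the idea above concrete, here is a minimal numpy sketch of the reconstruction-entropy computation. It is not the authors' implementation: the deep network is a toy stack of random ReLU layers, the per-layer auxiliary networks are replaced by linear decoders fit in closed form via least squares (standing in for trained single-layer networks), and the relative entropy is the KL divergence between each input and its reconstruction, both normalized to probability distributions. All names and shapes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def relative_entropy(p, q, eps=1e-12):
    """KL divergence D(p || q) between two non-negative arrays
    normalized to probability distributions."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Toy "deep network": a stack of random weight matrices with ReLU activations.
dim, depth = 64, 5
weights = [rng.normal(0.0, 1.0 / np.sqrt(dim), (dim, dim)) for _ in range(depth)]

# A batch of non-negative toy "images", one per row.
x = np.abs(rng.normal(size=(200, dim)))

# Forward pass, keeping the activations after each layer.
acts, h = [], x
for W in weights:
    h = relu(h @ W)
    acts.append(h)

# One linear auxiliary decoder per layer, fit by least squares to map
# that layer's activations back to the original inputs.
entropies = []
for h in acts:
    D, *_ = np.linalg.lstsq(h, x, rcond=None)      # decoder weights
    recon = np.clip(h @ D, 1e-12, None)            # reconstructed inputs
    kl = np.mean([relative_entropy(xi, ri) for xi, ri in zip(x, recon)])
    entropies.append(kl)

print(entropies)  # per-layer reconstruction entropy; growth with depth signals information loss
```

The per-layer curve of `entropies` is the quantity of interest: layers after which the input can no longer be reconstructed contribute large relative entropy, which is the signal the paper ties to the network's phase behavior and trainability.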
Low Difficulty Summary (GrooveSquid.com original content)
The researchers developed a new way to predict whether a neural network will be trainable. They attach small auxiliary networks that try to reconstruct the original input from what the main network has done to it at each layer. Measuring how much information is lost in this reconstruction helps figure out whether the neural network will work well or not. The method works for different types of neural networks and can even show how the network changes the input data as it is processed.

Keywords

  • Artificial intelligence
  • Neural network