Summary of Opening the Black Box: Predicting the Trainability of Deep Neural Networks with Reconstruction Entropy, by Yanick Thurn et al.
Opening the Black Box: predicting the trainability of deep neural networks with reconstruction entropy
by Yanick Thurn, Ro Jefferson, Johanna Erdmenger
First submitted to arXiv on: 13 Jun 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Disordered Systems and Neural Networks (cond-mat.dis-nn); High Energy Physics – Theory (hep-th); Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract here
Medium | GrooveSquid.com (original content) | The proposed method predicts the trainable regime in parameter space for deep feedforward neural networks (DNNs) by reconstructing inputs from subsequent activation layers via a cascade of single-layer auxiliary networks. This makes it possible to predict trainability after just one epoch of training, reducing overall training time on datasets including MNIST, CIFAR10, FashionMNIST, and white noise. The method computes the relative entropy between the reconstructed images and the original inputs to probe information loss, a quantity that is sensitive to the phase behavior of the network. The approach generalizes to residual neural networks (ResNets) and convolutional neural networks (CNNs). By displaying the changes made to the input data at each layer, the method also illustrates a network's decision-making process.
Low | GrooveSquid.com (original content) | The researchers developed a new way to predict when a neural network will be trainable. They use a special kind of auxiliary network that reconstructs the original input from what the main network produces at each layer. Comparing the reconstruction to the original input helps determine whether the network will train successfully. The method works for different types of neural networks and can even show how the network transforms the input data as it is processed.
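To make the idea above concrete, here is a minimal, hedged sketch in NumPy: a toy single-layer network, a single-layer auxiliary "decoder" fitted to reconstruct the input from the activations (ordinary least squares stands in for briefly training the auxiliary network), and a relative-entropy score between each original input and its reconstruction. All names, sizes, and the least-squares shortcut are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def relative_entropy(p, q, eps=1e-12):
    """D(p || q) between two non-negative vectors, normalized to distributions."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Toy forward layer: x -> tanh(W x)
d_in, d_hidden, n = 8, 16, 200
W = rng.normal(0.0, 1.0 / np.sqrt(d_in), size=(d_hidden, d_in))
X = rng.random((n, d_in))        # stand-in "images" with non-negative pixels
H = np.tanh(X @ W.T)             # activations after one layer

# Single-layer auxiliary decoder: fit V minimizing ||H V - X||^2
# (a least-squares stand-in for training the auxiliary network)
V, *_ = np.linalg.lstsq(H, X, rcond=None)
X_rec = np.clip(H @ V, 1e-9, None)

# Reconstruction entropy for this layer: mean relative entropy between each
# original input and its reconstruction; small values suggest little
# information loss, hinting the layer sits in a trainable regime
scores = [relative_entropy(x, xr) for x, xr in zip(X, X_rec)]
print(np.mean(scores))
```

In the paper's setting this probe would be applied layer by layer, with the per-layer information loss indicating where in parameter space the network remains trainable; the sketch only demonstrates the score for one layer.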
Keywords
- Artificial intelligence
- Neural network