
Summary of Concurrent Training and Layer Pruning of Deep Neural Networks, by Valentin Frank Ingmar Guenter and Athanasios Sideris


Concurrent Training and Layer Pruning of Deep Neural Networks

by Valentin Frank Ingmar Guenter, Athanasios Sideris

First submitted to arXiv on: 6 Jun 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper, each written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com; original content)
The proposed algorithm reduces the computational complexity of neural networks by identifying and eliminating irrelevant layers during the early stages of training. Unlike traditional weight- or filter-level pruning, it targets whole layers, which reduces sequential computation and allows more efficient parallelization. Residual connections around the nonlinear network sections keep information flowing after a layer is pruned. The approach is built on variational inference principles with Gaussian scale mixture priors on the network weights, enabling significant cost savings during both training and inference. It learns the variational posterior distribution of scalar Bernoulli random variables that multiply the layer weight matrices of the nonlinear sections, acting as a form of adaptive layer-wise dropout (a rough code sketch of this gating idea follows these summaries). To address premature pruning and a lack of robustness, a “flattening” hyper-prior is placed on the prior parameters. The resulting optimization problem is solved with projected SGD and is proven to converge to deterministic networks, with the Bernoulli posterior parameters reaching 0 or 1; practical pruning conditions are derived from these theoretical results. Evaluated on the MNIST, CIFAR-10, and ImageNet datasets with LeNet, VGG16, and ResNet architectures, the proposed method achieves state-of-the-art layer-pruning performance at reduced computational cost.

Low Difficulty Summary (written by GrooveSquid.com; original content)
The researchers developed a new way to make neural networks more efficient by removing unnecessary parts during training. They focused on whole “layers” rather than individual weights or filters, which speeds up the sequential computations that are hardest to parallelize. To achieve this, they used special skip connections and mathematical techniques inspired by variational inference. This approach not only saves time but also improves performance compared to existing methods.
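
To make the gating idea concrete, here is a minimal, hypothetical PyTorch sketch (not the authors’ code): each nonlinear section sits inside a skip connection and is multiplied by a scalar Bernoulli gate whose inclusion probability theta is learned jointly with the weights; a projection step keeps theta in [0, 1], and a section whose probability collapses to zero can be pruned while the skip connection preserves information flow. The class and parameter names (GatedResidualBlock, theta, prune_threshold) are illustrative assumptions, and the variational regularization terms coming from the Gaussian scale mixture prior and the flattening hyper-prior are omitted.

    import torch
    import torch.nn as nn

    class GatedResidualBlock(nn.Module):
        # One nonlinear section wrapped in a skip connection and scaled by a
        # scalar Bernoulli gate xi ~ Bernoulli(theta), with theta learned.
        def __init__(self, dim: int, init_theta: float = 0.9):
            super().__init__()
            self.body = nn.Sequential(              # the section that may be pruned
                nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim)
            )
            self.theta = nn.Parameter(torch.tensor(init_theta))  # inclusion probability

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            if self.training:
                # Sample the gate; the straight-through trick passes a gradient to
                # theta, so this acts like adaptive dropout on the whole section.
                xi = torch.bernoulli(self.theta.detach().clamp(0.0, 1.0))
                gate = xi + self.theta - self.theta.detach()
            else:
                gate = (self.theta > 0.5).float()   # deterministic network at test time
            return x + gate * self.body(x)          # skip path keeps information flowing

        @torch.no_grad()
        def project_and_maybe_prune(self, prune_threshold: float = 1e-3) -> bool:
            # Projection step of projected SGD: keep theta inside [0, 1].
            self.theta.clamp_(0.0, 1.0)
            # Illustrative pruning condition: drop the section once theta is near 0.
            return self.theta.item() < prune_threshold

In training, one would call project_and_maybe_prune on every block after each optimizer step; a block flagged for pruning can be replaced by an identity so that only its skip connection remains, which is where the savings in sequential computation come from.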

Keywords

  * Artificial intelligence
  * Dropout
  * Inference
  * Neural network
  * Optimization
  * Pruning
  * ResNet