
Summary of On the Sample Complexity of One Hidden Layer Networks with Equivariance, Locality and Weight Sharing, by Arash Behboodi et al.


On the Sample Complexity of One Hidden Layer Networks with Equivariance, Locality and Weight Sharing

by Arash Behboodi, Gabriele Cesa

First submitted to arXiv on: 21 Nov 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Statistics Theory (math.ST); Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The abstract discusses how different design choices in neural networks contribute to their sample efficiency. Weight sharing, equivariance, and local filters are believed to be important, but it is unclear how much each one affects the generalization error. The authors use statistical learning theory to address this question by characterizing the relative impact of each choice on the sample complexity. They obtain bounds for single-hidden-layer networks with certain activation functions, as well as for max-pooling and multi-layer networks. The results show that non-equivariant weight sharing can have generalization bounds similar to equivariant weight sharing, while locality brings generalization benefits that are traded off against expressivity. Experiments are conducted to highlight consistent trends. A hypothetical code sketch contrasting these design choices follows the summaries below.

Low Difficulty Summary (written by GrooveSquid.com, original content)
Neural networks have special designs that help them learn quickly from few examples. But how do these designs affect how well they generalize? Researchers used a mathematical framework called statistical learning theory to answer this question. They found that some design choices, like using local filters, can make neural networks better at generalizing. Other design choices, like weight sharing, don't seem to matter as much. The results show that making neural networks less specialized in certain ways does not necessarily hurt how well they generalize.

Keywords

* Artificial intelligence
* Generalization