

Deeper or Wider: A Perspective from Optimal Generalization Error with Sobolev Loss

by Yahong Yang, Juncai He

First submitted to arXiv on: 31 Jan 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Numerical Analysis (math.NA); Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same paper at a different level of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to read the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
In this research paper, the authors ask which architecture works best for a neural network: a deeper model with many layers, or a wider model with fewer layers but more neurons per layer. Comparing the two through their optimal generalization error with Sobolev losses, the study finds that the answer depends on factors such as the number of sample points, the number of network parameters, and the regularity of the loss function. Specifically, a larger parameter budget favors wider architectures, while more sample points and a more regular loss function favor deeper networks. The theory is then applied to solving partial differential equations with the deep Ritz method and physics-informed neural networks (PINNs), guiding the design of those networks. A small code sketch after these summaries illustrates the deeper-versus-wider comparison.

Low Difficulty Summary (written by GrooveSquid.com, original content)
Scientists are trying to figure out what makes a good neural network architecture. They compared two kinds: ones with many layers (deep) and ones with fewer layers but more “stuff” in each layer (wide). The study found that things like how many examples you have, how many parts the network has, and how smooth the loss function is all affect which kind works best. If your network has lots of parts, it prefers to be wide. But if you have lots of examples and a smooth loss function, a deep network might be better. This knowledge was used to solve tricky math problems (partial differential equations) with special kinds of neural networks.
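
To make the deeper-versus-wider distinction concrete, here is a minimal sketch in plain Python. It is not code from the paper: it only counts the parameters of two fully connected networks, one deep and narrow and one shallow and wide, with made-up layer widths chosen so the two share the same parameter budget. The paper's contribution is the analysis of which layout achieves the smaller generalization error under a Sobolev loss for a given budget, sample count, and loss regularity.

```python
# Minimal, illustrative sketch (not code from the paper): it only counts
# parameters to make the "deeper vs. wider" comparison concrete. The layer
# widths below are made-up examples chosen so both networks have the same
# parameter budget.

def mlp_param_count(widths):
    """Weights + biases of a fully connected net with the given layer sizes."""
    return sum(n_in * n_out + n_out            # weight matrix + bias vector
               for n_in, n_out in zip(widths, widths[1:]))

deeper = [1] + [30] * 10 + [1]   # 10 hidden layers, 30 neurons each
wider  = [1] + [90] * 2  + [1]   # 2 hidden layers, 90 neurons each

print("deeper:", mlp_param_count(deeper), "parameters")   # 8461
print("wider :", mlp_param_count(wider), "parameters")    # 8461
```

In the paper's terms, for a fixed budget like the one shared above, more sample points and a more regular loss function would favor the deeper layout, while a larger parameter budget would favor the wider one.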

Keywords

* Artificial intelligence
* Loss function
* Neural network