
Stochastic Taylor Derivative Estimator: Efficient amortization for arbitrary differential operators

by Zekun Shi, Zheyuan Hu, Min Lin, Kenji Kawaguchi

First submitted to arXiv on: 27 Nov 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract, which you can read on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper presents an efficient method for optimizing neural networks whose loss functions contain high-dimensional and high-order differential operators. Evaluating such losses with back-propagation is expensive: the derivative tensor size scales polynomially with the input dimension, and the computation graph scales exponentially with the order of the differential operator. Previous works addressed the dimension scaling by amortizing the computation over the optimization process via randomization, and the order scaling with high-order auto-differentiation (AD), but only for univariate functions. This paper shows how to efficiently perform arbitrary contractions of the derivative tensor of multivariate functions by properly constructing the input tangents to univariate high-order AD. Applied to Physics-Informed Neural Networks (PINNs), the method achieves a >1000x speed-up and >30x memory reduction over randomization with first-order AD, opening the door to using high-order differential operators in large-scale problems, such as solving a 1-million-dimensional PDE in 8 minutes on a single NVIDIA A100 GPU.
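
To make the core idea concrete, here is a minimal sketch, in JAX, of the simplest instance of this pattern: estimating a Laplacian (a second-order operator) with Taylor-mode AD. This is an illustration of the general approach, not the authors' full method. Pushing a second-order input tangent (v, 0) through jax.experimental.jet yields the Hessian contraction v^T H v in a single pass, and averaging over random tangents v with E[v v^T] = I gives an unbiased estimate of the trace, i.e., the Laplacian. The test function f and the Rademacher tangent distribution below are illustrative choices.

```python
import jax
import jax.numpy as jnp
from jax.experimental.jet import jet

def f(x):
    # Illustrative scalar field; its Laplacian has the closed form sum(-sin(x)).
    return jnp.sum(jnp.sin(x))

def vhv(x, v):
    # Push the 2nd-order input tangent (v, 0) through f with Taylor-mode AD.
    # jet returns the Taylor coefficients of t -> f(x + v*t); the second
    # coefficient equals (1/2) * v^T H v, where H is the Hessian of f at x.
    _, (_, f2) = jet(f, (x,), ((v, jnp.zeros_like(v)),))
    return 2.0 * f2

def laplacian_estimate(key, x, num_samples=64):
    # Hutchinson-style stochastic contraction: E[v^T H v] = tr(H)
    # whenever the random tangents satisfy E[v v^T] = I.
    vs = jax.random.rademacher(key, (num_samples, x.shape[0]), dtype=x.dtype)
    return jnp.mean(jax.vmap(lambda v: vhv(x, v))(vs))

key = jax.random.PRNGKey(0)
x = jnp.linspace(0.0, 1.0, 1000)   # a 1000-dimensional input point
print(laplacian_estimate(key, x))  # stochastic estimate
print(jnp.sum(-jnp.sin(x)))        # exact Laplacian for comparison
```

Each sample costs roughly one forward pass regardless of the input dimension; per the abstract, STDE generalizes this construction by choosing the input tangent distribution so that the same one-pass Taylor-mode evaluation contracts derivative tensors of arbitrary order.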

Low Difficulty Summary (written by GrooveSquid.com, original content)
The paper is about making it easier to use neural networks for calculations that involve many variables and repeated derivatives. These calculations are normally very slow, because the number of derivative terms grows extremely fast as more variables and higher-order derivatives are added. The authors present a new way of doing the calculation that is much faster and uses far less memory. This lets them solve a problem with a million variables in just 8 minutes on a single powerful graphics card.

Keywords

» Artificial intelligence  » Optimization