
Hybrid Coordinate Descent for Efficient Neural Network Learning Using Line Search and Gradient Descent

by Yen-Che Hsiao, Abhishek Dutta

First submitted to arXiv on: 2 Aug 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (original content by GrooveSquid.com)
A novel coordinate descent algorithm is introduced that combines one-directional line search with gradient information for parameter updates, focusing on squared error loss functions. For each parameter, the update is performed by either the line search or the gradient method, depending on whether the modulus of the gradient exceeds a predefined threshold (a minimal sketch of this switching rule follows these summaries). The approach can be more efficient at larger threshold values. Although the line search method may be slower than traditional gradient descent, its parallelizability reduces computational time. Experimental results on a 2-layer Rectified Linear Unit (ReLU) network with synthetic data demonstrate the impact of the hyperparameters on convergence rate and efficiency.

Low Difficulty Summary (original content by GrooveSquid.com)
This paper presents an efficient algorithm for updating the parameters in machine learning models. It uses two different update methods: one is called line search, and the other is based on gradients. For each parameter, the algorithm picks which method to use depending on how big the gradient is. This helps the algorithm work faster when the threshold is set higher. Even though the line search might be slower, it can do many of its calculations at the same time. The authors tested this on a simple two-layer neural network with synthetic data and showed that adjusting some settings changes how quickly the process converges.

Keywords

  • Artificial intelligence
  • Gradient descent
  • Machine learning
  • Neural network
  • Synthetic data