Summary of Disentangling the Causes of Plasticity Loss in Neural Networks, by Clare Lyle et al.
Disentangling the Causes of Plasticity Loss in Neural Networks
by Clare Lyle, Zeyu Zheng, Khimya Khetarpal, Hado van Hasselt, Razvan Pascanu, James Martens, Will Dabney
First submitted to arXiv on: 29 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper challenges the common assumption in neural network training that the data distribution is stationary. When this assumption is violated, networks become unstable and highly sensitive to hyperparameters and random seeds. A key driver of this instability is loss of plasticity: as training progresses, it becomes harder for the network to update its predictions. While recent work has offered partial solutions, this paper decomposes loss of plasticity into independent mechanisms and shows that intervening on several of them at once yields more robust learning algorithms. In particular, a combination of layer normalization and weight decay is effective at maintaining plasticity across synthetic nonstationary tasks and reinforcement learning environments such as the Arcade Learning Environment (a minimal code sketch of this combination appears after the table). |
Low | GrooveSquid.com (original content) | The paper looks at how neural networks are usually trained under the assumption that the data never changes. When that assumption isn’t true, networks can become unstable, and their results can swing a lot based on small details like hyperparameters and random seeds. The core problem is that as networks keep learning, it gets harder for them to update their predictions from new information. Some recent research has helped with this issue, but the paper asks how the different causes of the problem can be pulled apart. It finds that the mechanisms behind losing plasticity can be separated, and that using several fixes at once gives more robust learning algorithms. The paper shows that combining two techniques, layer normalization and weight decay, keeps neural networks flexible in both synthetic tasks and more realistic settings such as reinforcement learning. |
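
The summaries above point to the paper’s headline intervention: combining layer normalization with weight decay. The snippet below is a minimal PyTorch sketch of what that combination can look like in practice, assuming a small MLP, AdamW’s decoupled weight decay, and a toy nonstationary regression task whose targets are periodically re-randomized. These details are illustrative assumptions, not the authors’ actual architecture or benchmarks, which include synthetic nonstationary tasks and the Arcade Learning Environment.

```python
# Minimal sketch (not the paper's exact setup): a small MLP that combines
# layer normalization with weight decay, trained on a toy nonstationary
# regression task where the target function is re-randomized between "tasks".
import torch
import torch.nn as nn

class LayerNormMLP(nn.Module):
    def __init__(self, in_dim=8, hidden=64, out_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.LayerNorm(hidden),   # layer normalization after each hidden layer
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.LayerNorm(hidden),
            nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)

model = LayerNormMLP()
# Weight decay enters only through the optimizer; AdamW applies it in
# decoupled form. The 1e-2 coefficient is an illustrative choice.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
loss_fn = nn.MSELoss()

x = torch.randn(256, 8)
for task in range(5):
    # A fresh random target network on each task makes the targets nonstationary,
    # so the learner must keep adapting rather than fit a single fixed function.
    with torch.no_grad():
        y = LayerNormMLP()(x)
    for step in range(200):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    print(f"task {task}: final loss {loss.item():.4f}")
```

The two pieces play different roles in this sketch: layer normalization is built into the architecture, while weight decay is applied only through the optimizer, so the combination can be dropped into an existing training loop with very little extra code.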
Keywords
- Artificial intelligence
- Neural network
- Reinforcement learning