Summary of Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps, by Benjamin Ellis et al.
Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps
by Benjamin Ellis, Matthew T. Jackson, Andrei Lupu, Alexander D. Goldie, Mattie Fellows, Shimon Whiteson, Jakob Foerster
First submitted to arXiv on: 22 Dec 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Optimization and Control (math.OC)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract (available on arXiv). |
| Medium | GrooveSquid.com (original content) | The paper presents a novel approach to addressing nonstationarity in reinforcement learning (RL) by adapting the widely used Adam optimizer. The authors analyze the impact of nonstationary gradient magnitudes on Adam's update size, demonstrating that changes such as a target-network update can lead to large updates and sub-optimal performance. To address this, they introduce Adam-Rel, which uses the local timestep within an epoch rather than the global timestep, effectively resetting Adam's timestep to 0 after each target change. This avoids large updates when gradient magnitudes spike and reduces to learning-rate annealing when they do not (a minimal code sketch of the idea follows this table). The authors evaluate Adam-Rel in both on-policy and off-policy RL on the Atari and Craftax benchmarks, demonstrating improved performance. |
| Low | GrooveSquid.com (original content) | The paper helps improve reinforcement learning (RL) by making the widely used Adam optimizer work better in changing environments. Normally, techniques developed for supervised learning are reused in RL, but this can cause problems. The authors found that changes in target networks can make updates too big and hurt performance. They created a new version of Adam called Adam-Rel, which restarts Adam's internal step counter to avoid these big updates. This makes RL perform better on some tasks. |
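The medium-difficulty summary describes Adam-Rel's core change: using the local timestep within an epoch rather than the global one. Below is a minimal, illustrative Python/NumPy sketch of that idea, not the authors' implementation; the class and method names (`AdamRelSketch`, `reset_timestep`) are hypothetical, and only the timestep is reset, following the summary's description.

```python
import numpy as np

class AdamRelSketch:
    """Sketch of Adam-Rel: standard Adam, except the timestep t is reset
    to 0 at each epoch boundary (e.g. after a target-network change), so
    bias correction uses a local rather than global timestep."""

    def __init__(self, lr=3e-4, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = None  # first-moment estimate (not reset)
        self.v = None  # second-moment estimate (not reset)
        self.t = 0     # *local* timestep within the current epoch

    def reset_timestep(self):
        # Call at each epoch boundary / target change: Adam-Rel counts
        # timesteps relative to this point instead of globally.
        self.t = 0

    def step(self, params, grads):
        if self.m is None:
            self.m = np.zeros_like(params)
            self.v = np.zeros_like(params)
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grads
        self.v = self.beta2 * self.v + (1 - self.beta2) * grads ** 2
        # Bias correction uses the local timestep; per the paper's analysis,
        # this avoids the large updates that a sudden change in gradient
        # magnitude would otherwise cause with the global timestep.
        m_hat = self.m / (1 - self.beta1 ** self.t)
        v_hat = self.v / (1 - self.beta2 ** self.t)
        return params - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)
```

In a typical RL training loop, `reset_timestep` would be called once per epoch (or whenever the target network changes), while `step` is called on every minibatch as usual.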
Keywords
* Artificial intelligence
* Reinforcement learning
* Supervised