Summary of Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps, by Benjamin Ellis et al.
Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps
by Benjamin Ellis, Matthew T. Jackson, Andrei Lupu, Alexander D. Goldie, Mattie Fellows, Shimon Whiteson, Jakob Foerster
First submitted to arXiv on: 22 Dec 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Optimization and Control (math.OC)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract (available on arXiv). |
| Medium | GrooveSquid.com (original content) | The paper presents a novel approach to addressing nonstationarity in reinforcement learning (RL) by adapting the widely used Adam optimizer. The authors analyze the impact of nonstationary gradient magnitudes on Adam's update size, demonstrating that changes such as a target-network update can lead to large updates and sub-optimal performance. To address this, they introduce Adam-Rel, which uses the local timestep within an epoch rather than the global timestep, effectively resetting Adam's timestep to 0 after each target change. This avoids large updates when gradient magnitudes spike and reduces to learning-rate annealing when they do not (a minimal code sketch of the idea follows this table). The authors evaluate Adam-Rel in both on-policy and off-policy RL on the Atari and Craftax benchmarks, demonstrating improved performance. |
| Low | GrooveSquid.com (original content) | The paper helps improve reinforcement learning (RL) by making the widely used Adam optimizer work better in changing environments. Normally, techniques developed for supervised learning are reused in RL, but this can cause problems. The authors found that changes in target networks can make updates too big and hurt performance. They created a new version of Adam called Adam-Rel, which restarts Adam's internal step counter to avoid these big updates. This makes RL perform better on some tasks. |
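The medium-difficulty summary describes Adam-Rel's core change: using the local timestep within an epoch rather than the global one. Below is a minimal, illustrative Python/NumPy sketch of that idea, not the authors' implementation; the class and method names (`AdamRelSketch`, `reset_timestep`) are hypothetical, and only the timestep is reset, following the summary's description.

```python
import numpy as np

class AdamRelSketch:
    """Sketch of Adam-Rel: standard Adam, except the timestep t is reset
    to 0 at each epoch boundary (e.g. after a target-network change), so
    bias correction uses a local rather than global timestep."""

    def __init__(self, lr=3e-4, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = None  # first-moment estimate (not reset)
        self.v = None  # second-moment estimate (not reset)
        self.t = 0     # *local* timestep within the current epoch

    def reset_timestep(self):
        # Call at each epoch boundary / target change: Adam-Rel counts
        # timesteps relative to this point instead of globally.
        self.t = 0

    def step(self, params, grads):
        if self.m is None:
            self.m = np.zeros_like(params)
            self.v = np.zeros_like(params)
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grads
        self.v = self.beta2 * self.v + (1 - self.beta2) * grads ** 2
        # Bias correction uses the local timestep; per the paper's analysis,
        # this avoids the large updates that a sudden change in gradient
        # magnitude would otherwise cause with the global timestep.
        m_hat = self.m / (1 - self.beta1 ** self.t)
        v_hat = self.v / (1 - self.beta2 ** self.t)
        return params - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)
```

In a typical RL training loop, `reset_timestep` would be called once per epoch (or whenever the target network changes), while `step` is called on every minibatch as usual.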
Keywords
* Artificial intelligence
* Reinforcement learning
* Supervised