Summary of "DRED: Zero-Shot Transfer in Reinforcement Learning via Data-Regularised Environment Design", by Samuel Garcin et al.
DRED: Zero-Shot Transfer in Reinforcement Learning via Data-Regularised Environment Design
by Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht
First submitted to arXiv on: 5 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper and is written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper’s original abstract (available on the arXiv page). |
| Medium | GrooveSquid.com (original content) | Autonomous agents trained using deep reinforcement learning (RL) often struggle to generalize to new environments, even when those environments share characteristics with the ones encountered during training. This work investigates how the sampling of individual environment instances, or levels, affects the zero-shot generalization ability of RL agents. The authors find that prioritizing levels according to their value loss minimizes the mutual information between the agent’s internal representation and the set of training levels in the generated data, providing a novel theoretical justification for certain adaptive level sampling strategies. They also examine unsupervised environment design (UED) methods, which assume control over level generation, and find that these methods can significantly shift the training distribution, resulting in poor zero-shot generalization. To prevent both overfitting and distributional shift, the authors introduce data-regularised environment design (DRED), which generates levels using a generative model trained to approximate the ground-truth distribution of an initial set of level parameters. DRED achieves significant improvements in zero-shot generalization over both adaptive level sampling strategies and UED methods (a minimal code sketch of these ideas follows the table). |
| Low | GrooveSquid.com (original content) | Autonomous agents often can’t generalize well to new environments, even when those environments share characteristics with the ones they’ve seen before. This paper looks at how agents are trained using deep reinforcement learning (RL) and finds that the way individual environment instances are chosen for training affects how well an agent generalizes. It also explores methods that design new training environments and finds that current approaches can make it harder for agents to generalize. To solve this problem, the paper proposes data-regularised environment design (DRED), which generates new training levels in a way that helps agents generalize better. |
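A minimal, hypothetical Python sketch (not the authors’ code) may help make the two ideas above concrete. The `ValueLossLevelSampler` mimics value-loss-prioritized level sampling: levels whose value prediction was recently most wrong are replayed more often. The `dred_training_level` helper mimics DRED’s data regularisation: new levels are drawn from a generative model fit to the initial level set, so the training distribution cannot drift far from the ground truth. The rank-based weighting, the `temperature` parameter, the `replay_prob` mixing, and the `generator.sample()` interface are all illustrative assumptions rather than details from the paper.

```python
import numpy as np

class ValueLossLevelSampler:
    """Replay levels in proportion to their most recent value loss (sketch)."""

    def __init__(self, num_levels, temperature=0.3, rng=None):
        self.scores = np.zeros(num_levels)  # last observed value loss per level
        self.temperature = temperature      # lower values sharpen the priority distribution
        self.rng = rng or np.random.default_rng()

    def update(self, level_id, value_loss):
        # Record the mean value-prediction error from the latest rollout on this level.
        self.scores[level_id] = value_loss

    def sample(self):
        # Rank-based prioritization: the higher a level's value loss, the more
        # likely it is replayed. Rank 1 corresponds to the highest loss.
        ranks = np.argsort(np.argsort(-self.scores)) + 1
        weights = (1.0 / ranks) ** (1.0 / self.temperature)
        probs = weights / weights.sum()
        return self.rng.choice(len(self.scores), p=probs)

def dred_training_level(sampler, generator, replay_prob=0.5, rng=None):
    """Pick the next training level, DRED-style (sketch).

    With probability `replay_prob`, replay a high-value-loss level; otherwise
    draw a fresh level from a generative model (`generator`, e.g. a VAE over
    level parameters) trained on the initial level set, which keeps the
    training distribution anchored to the ground-truth distribution.
    """
    rng = rng or np.random.default_rng()
    if rng.random() < replay_prob:
        return "replay", sampler.sample()
    return "generated", generator.sample()  # hypothetical generator interface
```

In a full training loop one would call `sampler.update(level_id, mean_value_loss)` after each rollout and pick the next level with `dred_training_level`, so that prioritized replay fights overfitting while the generative model prevents distributional shift.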
Keywords
* Artificial intelligence * Generalization * Generative model * Overfitting * Reinforcement learning * Unsupervised * Zero shot