


How the level sampling process impacts zero-shot generalisation in deep reinforcement learning

by Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht

First submitted to arXiv on: 5 Oct 2023

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract, which can be read on arXiv.

Medium Difficulty Summary (original content by GrooveSquid.com)
This paper investigates why autonomous agents trained via deep reinforcement learning (RL) often fail to generalize to environment instances (levels) they were not trained on. The authors show that the strategy used to sample training levels affects zero-shot generalization (ZSG), with instance overfitting and over-generalization as the two potential failure modes. They measure the mutual information (MI) between the agent's internal representation and the identity of the training levels, and find that it correlates well with instance overfitting. Adaptive sampling strategies that prioritize levels by value loss are more effective at keeping this MI low. The authors also examine unsupervised environment design (UED) methods, which adaptively generate new training levels but induce a shift away from the original level distribution, hurting ZSG performance. To address this, they introduce self-supervised environment design (SSED), which generates levels with a variational autoencoder, reducing MI while minimizing the distribution shift and achieving statistically significant improvements in ZSG.
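As a concrete illustration of the adaptive sampling idea, here is a minimal Python sketch of rank-based, value-loss-prioritized level sampling in the spirit of methods such as Prioritized Level Replay. It is an assumption for illustration, not the paper's actual code: the 1/rank weighting, the moving-average score update, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

num_levels = 100
value_loss = np.ones(num_levels)  # running value-loss score per level (hypothetical init)

def sample_level(temperature=1.0):
    # Rank levels by value loss (rank 1 = highest loss) and sample with
    # probability proportional to (1/rank)^(1/temperature).
    ranks = np.empty(num_levels, dtype=int)
    ranks[np.argsort(-value_loss)] = np.arange(1, num_levels + 1)
    weights = (1.0 / ranks) ** (1.0 / temperature)
    return rng.choice(num_levels, p=weights / weights.sum())

def update_score(level, new_loss, alpha=0.1):
    # Exponential moving average keeps a level's score current as the agent improves.
    value_loss[level] = (1 - alpha) * value_loss[level] + alpha * new_loss
```

Levels on which the value function is most wrong are replayed more often, which, per the paper's analysis, also keeps the MI between representations and level identity lower than uniform sampling does.

The level-generation step of SSED can be sketched in the same hedged spirit: decode latents close to the encodings of existing training levels, so that generated levels stay near the training distribution. The grid size, architecture, and generate_level helper below are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

GRID, TILES, LATENT = 13, 4, 16  # hypothetical 13x13 grid levels with 4 tile types

class LevelVAE(nn.Module):
    """Minimal VAE over one-hot grid levels (illustrative architecture)."""
    def __init__(self):
        super().__init__()
        flat = GRID * GRID * TILES
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(flat, 256), nn.ReLU())
        self.mu = nn.Linear(256, LATENT)
        self.logvar = nn.Linear(256, LATENT)
        self.decoder = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(),
                                     nn.Linear(256, flat))

    def encode(self, x):
        h = self.encoder(x)
        return self.mu(h), self.logvar(h)

    def decode(self, z):
        return self.decoder(z).view(-1, GRID, GRID, TILES)

def generate_level(vae, train_level, noise_scale=0.5):
    # Perturb the latent of a known training level rather than sampling the
    # prior from scratch, keeping the generated level close to the training
    # distribution (the distribution-shift concern the paper raises for UED).
    with torch.no_grad():
        mu, _ = vae.encode(train_level.unsqueeze(0))
        z = mu + noise_scale * torch.randn_like(mu)
        return vae.decode(z).argmax(dim=-1).squeeze(0)  # tile indices of the new level
```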
Low Difficulty Summary (original content by GrooveSquid.com)
This research paper looks at why computers trained to make decisions on their own struggle when faced with new situations. The authors tested different ways of choosing training scenarios and found that some choices let these computers generalize to new environments much better than others. They also created a new method, called self-supervised environment design, that helps computers learn more effectively without needing additional information or supervision.

Keywords

  • Artificial intelligence
  • Generalization
  • Overfitting
  • Reinforcement learning
  • Self-supervised
  • Unsupervised
  • Variational autoencoder
  • Zero-shot