


How the level sampling process impacts zero-shot generalisation in deep reinforcement learning

by Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht

First submitted to arXiv on: 5 Oct 2023

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract, which can be read on arXiv.

Medium Difficulty Summary (original content by GrooveSquid.com)
This paper investigates why autonomous agents trained via deep reinforcement learning (RL) often fail to generalize to environment instances (levels) they were not trained on. The authors show that the strategy used to sample training levels affects zero-shot generalization (ZSG), with instance overfitting and over-generalization as the two potential failure modes. They measure the mutual information (MI) between the agent's internal representation and the identity of the training levels, and find that it correlates well with instance overfitting. Adaptive sampling strategies that prioritize levels by value loss are more effective at keeping this MI low. The authors also examine unsupervised environment design (UED) methods, which adaptively generate new training levels but induce a shift away from the original level distribution, hurting ZSG performance. To address this, they introduce self-supervised environment design (SSED), which generates levels with a variational autoencoder, reducing MI while minimizing the distribution shift and achieving statistically significant improvements in ZSG.
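As a concrete illustration of the adaptive sampling idea, here is a minimal Python sketch of rank-based, value-loss-prioritized level sampling in the spirit of methods such as Prioritized Level Replay. It is an assumption for illustration, not the paper's actual code: the 1/rank weighting, the moving-average score update, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

num_levels = 100
value_loss = np.ones(num_levels)  # running value-loss score per level (hypothetical init)

def sample_level(temperature=1.0):
    # Rank levels by value loss (rank 1 = highest loss) and sample with
    # probability proportional to (1/rank)^(1/temperature).
    ranks = np.empty(num_levels, dtype=int)
    ranks[np.argsort(-value_loss)] = np.arange(1, num_levels + 1)
    weights = (1.0 / ranks) ** (1.0 / temperature)
    return rng.choice(num_levels, p=weights / weights.sum())

def update_score(level, new_loss, alpha=0.1):
    # Exponential moving average keeps a level's score current as the agent improves.
    value_loss[level] = (1 - alpha) * value_loss[level] + alpha * new_loss
```

Levels on which the value function is most wrong are replayed more often, which, per the paper's analysis, also keeps the MI between representations and level identity lower than uniform sampling does.

The level-generation step of SSED can be sketched in the same hedged spirit: decode latents close to the encodings of existing training levels, so that generated levels stay near the training distribution. The grid size, architecture, and generate_level helper below are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

GRID, TILES, LATENT = 13, 4, 16  # hypothetical 13x13 grid levels with 4 tile types

class LevelVAE(nn.Module):
    """Minimal VAE over one-hot grid levels (illustrative architecture)."""
    def __init__(self):
        super().__init__()
        flat = GRID * GRID * TILES
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(flat, 256), nn.ReLU())
        self.mu = nn.Linear(256, LATENT)
        self.logvar = nn.Linear(256, LATENT)
        self.decoder = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(),
                                     nn.Linear(256, flat))

    def encode(self, x):
        h = self.encoder(x)
        return self.mu(h), self.logvar(h)

    def decode(self, z):
        return self.decoder(z).view(-1, GRID, GRID, TILES)

def generate_level(vae, train_level, noise_scale=0.5):
    # Perturb the latent of a known training level rather than sampling the
    # prior from scratch, keeping the generated level close to the training
    # distribution (the distribution-shift concern the paper raises for UED).
    with torch.no_grad():
        mu, _ = vae.encode(train_level.unsqueeze(0))
        z = mu + noise_scale * torch.randn_like(mu)
        return vae.decode(z).argmax(dim=-1).squeeze(0)  # tile indices of the new level
```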
Low Difficulty Summary (original content by GrooveSquid.com)
This research paper looks at why computers trained to make decisions on their own struggle when faced with new situations. The authors tested different ways of choosing training scenarios and found that some choices let these computers generalize to new environments much better than others. They also created a new method, called self-supervised environment design, that helps computers learn more effectively without needing additional information or supervision.

Keywords

  • Artificial intelligence
  • Generalization
  • Overfitting
  • Reinforcement learning
  • Self-supervised
  • Unsupervised
  • Variational autoencoder
  • Zero-shot