Summary of Curiosity & Entropy Driven Unsupervised RL in Multiple Environments, by Shaurya Dewan et al.
Curiosity & Entropy Driven Unsupervised RL in Multiple Environments
by Shaurya Dewan, Anisha Jain, Zoe LaLena, Lifan Yu
First submitted to arXiv on: 8 Jan 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The alpha-MEPOL method tackles unsupervised reinforcement learning across multiple environments by pre-training a task-agnostic exploration policy on interactions drawn from an entire class of environments. The authors improve upon this work with five modifications: sampling trajectories from an entropy-based probability distribution, a dynamic alpha, a higher KL divergence threshold, curiosity-driven exploration, and percentile sampling on curiosity (a few of these are sketched in the code after this table). Dynamic alpha and the higher KL divergence threshold significantly improved performance over the baseline, while PDF-based sampling provided no improvement because it is effectively equivalent to the baseline in small sample spaces. Curiosity-driven exploration enhanced learning by encouraging agents to seek diverse experiences and explore the unknown in high-dimensional environments; however, its benefits are limited in low-dimensional, simpler environments where exploration possibilities are constrained. |
| Low | GrooveSquid.com (original content) | The paper proposes a way for machines to learn new skills without being told exactly what to do. This is called unsupervised reinforcement learning. The authors improve an existing method by adding some new ideas, test these ideas on different types of problems, and find that some of them work better than others. One idea, called dynamic alpha, helps the machine learn faster and more accurately. Another idea, curiosity-driven exploration, encourages the machine to try new things and explore its surroundings. Overall, this research can help us create smarter machines that solve complex problems. |
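To make the modifications in the medium summary concrete, here is a minimal, hypothetical Python sketch of three of them: entropy-based PDF sampling of trajectories, percentile sampling on a curiosity score, and a dynamic alpha schedule. This is not the authors' code; every function name, interface, and schedule shape below is an assumption for illustration, and curiosity is modeled as forward-model prediction error (an ICM-style assumption).

```python
import numpy as np

rng = np.random.default_rng(0)

def entropy_pdf_sample(trajectory_entropies, k):
    """Draw k trajectory indices with probability given by a softmax over
    per-trajectory entropy estimates (assumed to be precomputed)."""
    e = np.asarray(trajectory_entropies, dtype=float)
    p = np.exp(e - e.max())  # numerically stable softmax
    p /= p.sum()
    return rng.choice(len(e), size=k, replace=False, p=p)

def curiosity_score(pred_next_obs, next_obs):
    """Prediction-error curiosity: squared error of a (hypothetical)
    learned forward model, summed over observation dimensions."""
    return np.sum((pred_next_obs - next_obs) ** 2, axis=-1)

def percentile_sample(scores, q=75.0):
    """Percentile sampling on curiosity: keep only trajectories whose
    curiosity score lies at or above the q-th percentile."""
    s = np.asarray(scores, dtype=float)
    return np.nonzero(s >= np.percentile(s, q))[0]

def dynamic_alpha(epoch, total_epochs, alpha_start=0.1, alpha_end=0.9):
    """Anneal alpha (the percentile parameter in the alpha-MEPOL-style
    objective) over training; the linear shape is purely an
    illustrative assumption."""
    frac = epoch / max(total_epochs - 1, 1)
    return alpha_start + frac * (alpha_end - alpha_start)

# Toy demonstration on random stand-in statistics for 10 trajectories.
entropies = rng.normal(size=10)
pred, actual = rng.normal(size=(10, 4)), rng.normal(size=(10, 4))
print(entropy_pdf_sample(entropies, k=4))
print(percentile_sample(curiosity_score(pred, actual), q=70))
print([round(dynamic_alpha(e, 5), 2) for e in range(5)])
```

Note how, with only a handful of candidate trajectories, the softmax distribution differs little from uniform sampling; that is one plausible reading of the summary's observation that PDF sampling gave no improvement in small sample spaces.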
Keywords
* Artificial intelligence * Probability * Reinforcement learning * Unsupervised