Summary of Mixtures of Experts Unlock Parameter Scaling for Deep RL, by Johan Obando-Ceron et al.
Mixtures of Experts Unlock Parameter Scaling for Deep RL
by Johan Obando-Ceron, Ghada Sokar, Timon Willi, Clare Lyle, Jesse Farebrother, Jakob Foerster, Gintare Karolina Dziugaite, Doina Precup, Pablo Samuel Castro
First submitted to arXiv on: 13 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract on the paper's arXiv page. |
Medium | GrooveSquid.com (original content) | The recent success of supervised learning models is largely driven by empirical scaling laws, where performance improves with model size. Reinforcement learning, in contrast, lacks such scaling laws: increasing a network's parameter count often hurts final performance. This paper investigates incorporating Mixture-of-Experts (MoE) modules, particularly Soft MoEs, into value-based networks to create more parameter-scalable models. The results show substantial performance gains across various training regimes and model sizes, providing strong empirical evidence toward developing scaling laws for reinforcement learning. |
Low | GrooveSquid.com (original content) | Reinforcement learning is a way computers learn from their mistakes. Usually, making the computer's "brain" bigger helps it do better on its own. But in some cases, making it bigger actually makes it worse! This paper explores how to fix this problem by combining several different ways of thinking (like a team of experts) inside the computer's brain. The authors tested this idea and found that it works really well across many situations, a big step forward for this kind of learning. |
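To make the Soft MoE idea mentioned above concrete, here is a minimal NumPy sketch of the mechanism: input tokens are softly dispatched into a fixed number of "slots", each expert processes its share of slots, and every token's output is a soft combination of all slot outputs. This is an illustrative forward pass only, with simple linear experts; the function names, shapes, and the choice of linear experts are our assumptions, not the paper's actual architecture or code.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def soft_moe(tokens, phi, expert_weights):
    """One Soft MoE forward pass (illustrative sketch).

    tokens:         (n, d) input token embeddings
    phi:            (d, num_slots) learnable slot parameters
    expert_weights: list of (d, d) matrices, one linear "expert";
                    slots are split evenly across experts
    """
    logits = tokens @ phi                       # (n, num_slots)
    dispatch = softmax(logits, axis=0)          # each slot: convex mix over tokens
    combine = softmax(logits, axis=1)           # each token: convex mix over slots
    slots = dispatch.T @ tokens                 # (num_slots, d) slot inputs
    per_expert = phi.shape[1] // len(expert_weights)
    outs = []
    for i, w in enumerate(expert_weights):
        chunk = slots[i * per_expert:(i + 1) * per_expert]
        outs.append(chunk @ w)                  # each expert processes its slots
    slot_outputs = np.vstack(outs)              # (num_slots, d)
    return combine @ slot_outputs               # (n, d) per-token outputs
```

Because every token contributes to every slot with soft weights, the layer is fully differentiable and avoids the hard routing of classical MoEs, which is part of why it scales gracefully when more experts (parameters) are added.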
Keywords
* Artificial intelligence * Reinforcement learning * Scaling laws * Supervised learning