
Summary of Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning, by Daniel Palenicek et al.


Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning

by Daniel Palenicek, Michael Lutter, Joao Carvalho, Jan Peters

First submitted to arXiv on: 7 Mar 2023

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

The high difficulty summary is the paper's original abstract, available on arXiv.
Medium Difficulty Summary (written by GrooveSquid.com, original content)

Model-based reinforcement learning aims to increase sample efficiency by leveraging accurate dynamics models. However, limited model accuracy and errors that compound over modeled trajectories hinder its performance. This paper empirically investigates how improving the learned dynamics model affects sample efficiency for model-based value expansion methods on continuous control problems. The study shows that longer rollout horizons increase sample efficiency, but the gain shrinks with each additional expansion step. Moreover, increased model accuracy only marginally improves sample efficiency compared to learned models with identical horizons. Longer horizons and more accurate models therefore yield diminishing returns in sample efficiency. In contrast, model-free value expansion methods perform on par with their model-based counterparts without the computational overhead of model rollouts. These findings suggest that the main limitation of model-based value expansion methods is not model accuracy but lies elsewhere. A short illustrative sketch of how such a value expansion target is formed follows the summaries below.
Low Difficulty Summary (written by GrooveSquid.com, original content)

Reinforcement learning tries to get better by using models of how the world works. But if these models are not accurate, it can't get much better. This paper looks at how making these models more accurate affects the results. The authors found that more accurate models help a little, but not as much as you might think, and each extra step of planning with the model helps less than the one before. Another way of doing things, without using models at all, actually performs just as well. So the problem is not with the models themselves, but with something else entirely.
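
To make the notion of a rollout horizon concrete, the sketch below shows how an H-step value expansion target is typically formed: a learned dynamics model and reward model are unrolled for H steps under the current policy, and a learned value function bootstraps the remainder. This is a minimal illustrative sketch under assumed interfaces, not the authors' implementation; all function names (`dynamics_model`, `reward_model`, `policy`, `value_fn`) are hypothetical placeholders.

```python
def mve_target(state, dynamics_model, reward_model, policy, value_fn,
               horizon, gamma=0.99):
    """Compute an H-step value-expansion-style target for a single state.

    All callables are hypothetical placeholders for learned components:
      dynamics_model(s, a) -> next state     (learned dynamics)
      reward_model(s, a)   -> scalar reward  (learned or known reward)
      policy(s)            -> action
      value_fn(s)          -> scalar value   (learned critic)
    """
    target = 0.0
    discount = 1.0
    for _ in range(horizon):
        action = policy(state)
        target += discount * reward_model(state, action)
        # Each additional step rolls the learned model forward again;
        # this is where model errors compound over the trajectory.
        state = dynamics_model(state, action)
        discount *= gamma
    # Bootstrap the remaining return with the learned value function.
    target += discount * value_fn(state)
    return target
```

Increasing `horizon` uses the model for more steps before bootstrapping; a horizon of one recovers the standard one-step temporal-difference target. The paper's finding is that each additional expansion step buys less and less sample efficiency, even when the model is made more accurate.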

Keywords

  • Artificial intelligence
  • Reinforcement learning