
Summary of Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning, by Daniel Palenicek et al.


Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning

by Daniel Palenicek, Michael Lutter, Joao Carvalho, Jan Peters

First submitted to arXiv on: 7 Mar 2023

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

The high difficulty summary is the paper's original abstract, available on arXiv.
Medium Difficulty Summary (written by GrooveSquid.com, original content)

Model-based reinforcement learning aims to increase sample efficiency by leveraging accurate dynamics models. However, limited model accuracy and errors that compound over modeled trajectories hinder its performance. This paper empirically investigates how improving the learned dynamics model affects sample efficiency for model-based value expansion methods on continuous control problems. The study shows that longer rollout horizons increase sample efficiency, but the gain shrinks with each additional expansion step. Moreover, increased model accuracy only marginally improves sample efficiency compared to learned models with identical horizons. Longer horizons and more accurate models therefore yield diminishing returns in sample efficiency. In contrast, model-free value expansion methods perform on par with their model-based counterparts without the computational overhead of model rollouts. These findings suggest that the main limitation of model-based value expansion methods is not model accuracy but lies elsewhere. A short illustrative sketch of how such a value expansion target is formed follows the summaries below.
Low Difficulty Summary (written by GrooveSquid.com, original content)

Reinforcement learning tries to get better by using models of how the world works. But if these models are not accurate, it can't get much better. This paper looks at how making these models more accurate affects the results. The authors found that more accurate models help a little, but not as much as you might think, and each extra step of planning with the model helps less than the one before. Another way of doing things, without using models at all, actually performs just as well. So the problem is not with the models themselves, but with something else entirely.
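
To make the notion of a rollout horizon concrete, the sketch below shows how an H-step value expansion target is typically formed: a learned dynamics model and reward model are unrolled for H steps under the current policy, and a learned value function bootstraps the remainder. This is a minimal illustrative sketch under assumed interfaces, not the authors' implementation; all function names (`dynamics_model`, `reward_model`, `policy`, `value_fn`) are hypothetical placeholders.

```python
def mve_target(state, dynamics_model, reward_model, policy, value_fn,
               horizon, gamma=0.99):
    """Compute an H-step value-expansion-style target for a single state.

    All callables are hypothetical placeholders for learned components:
      dynamics_model(s, a) -> next state     (learned dynamics)
      reward_model(s, a)   -> scalar reward  (learned or known reward)
      policy(s)            -> action
      value_fn(s)          -> scalar value   (learned critic)
    """
    target = 0.0
    discount = 1.0
    for _ in range(horizon):
        action = policy(state)
        target += discount * reward_model(state, action)
        # Each additional step rolls the learned model forward again;
        # this is where model errors compound over the trajectory.
        state = dynamics_model(state, action)
        discount *= gamma
    # Bootstrap the remaining return with the learned value function.
    target += discount * value_fn(state)
    return target
```

Increasing `horizon` uses the model for more steps before bootstrapping; a horizon of one recovers the standard one-step temporal-difference target. The paper's finding is that each additional expansion step buys less and less sample efficiency, even when the model is made more accurate.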

Keywords

  • Artificial intelligence
  • Reinforcement learning