
Summary of Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation, by Ignat Georgiev et al.


Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation

by Ignat Georgiev, Krishnan Srinivasan, Jie Xu, Eric Heiden, Animesh Garg

First submitted to arXiv on: 28 May 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (the paper's original abstract, written by the paper authors)
Read the original abstract on arXiv.

Medium Difficulty Summary (original content by GrooveSquid.com)
This paper examines Model-Free Reinforcement Learning (MFRL) and its limitations in continuous control tasks. MFRL, built on the policy gradient theorem, has achieved broad success but is hindered by the high gradient variance of zeroth-order gradient estimation. In contrast, First-Order Model-Based Reinforcement Learning (FO-MBRL) methods use differentiable simulation to obtain reduced-variance gradients, yet they suffer from sampling error when the dynamics are stiff, as during physical contact. The paper investigates this source of error and proposes Adaptive Horizon Actor-Critic (AHAC), an FO-MBRL algorithm that adapts its model-based rollout horizon to avoid stiff contact dynamics. Experiments show AHAC outperforming MFRL baselines, earning 40% more reward across a set of locomotion tasks while scaling efficiently to high-dimensional control environments with better wall-clock training time. (A rough code sketch of the adaptive-horizon idea appears after the summaries below.)

Low Difficulty Summary (original content by GrooveSquid.com)
This paper looks at two ways to learn when decisions must be made under uncertain outcomes. The first approach, Model-Free Reinforcement Learning (MFRL), works well for many problems but can be slow and imprecise because it has to try lots of different actions to estimate how good they are. The second approach, First-Order Model-Based Reinforcement Learning (FO-MBRL), is better at estimating the right action, but it struggles when the situation changes suddenly, such as when a robot makes or breaks contact with the ground. This paper addresses that problem with a new algorithm called Adaptive Horizon Actor-Critic (AHAC), which adapts how far ahead it looks in order to avoid those sudden changes. The results show that AHAC earns more reward than MFRL methods and learns more quickly.

Keywords

» Artificial intelligence  » Reinforcement learning