
Summary of PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer, by Chang Chen et al.


PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer

by Chang Chen, Junyeob Baek, Fei Deng, Kenji Kawaguchi, Caglar Gulcehre, Sungjin Ahn

First submitted to arXiv on: 10 Jun 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com; original content)
In this research paper, the authors aim to bridge the gap between offline reinforcement learning (RL) methods designed for short-horizon tasks and those that excel in long-horizon scenarios. They propose PlanDQ, a hierarchical planner that combines a diffusion-based high-level planner, D-Conductor, with a Q-learning-based low-level policy, Q-Performer. PlanDQ achieves superior performance across benchmark tasks, including the D4RL continuous-control benchmark as well as the long-horizon AntMaze, Kitchen, and Calvin tasks. The hierarchical design addresses the challenges of sparse-reward, long-horizon tasks by easing credit assignment and mitigating extrapolation errors.
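
To make the orchestration concrete, below is a minimal, self-contained Python sketch of the hierarchical idea described above. It is an illustration under loose assumptions, not the authors' implementation: the class names (DConductorSketch, QPerformerSketch), the interpolation-based subgoal "planner", and the greedy goal-seeking "policy" are hypothetical placeholders standing in for the paper's trained diffusion planner and Q-learning policy.

    # Hypothetical sketch of hierarchical plan orchestration:
    # a high-level "conductor" proposes a sparse sequence of subgoals,
    # and a low-level "performer" picks primitive actions toward the
    # current subgoal. Names and dynamics are illustrative only.

    import numpy as np

    class DConductorSketch:
        """Stand-in for a diffusion-based subgoal planner (hypothetical API)."""

        def __init__(self, num_subgoals: int, seed: int = 0):
            self.num_subgoals = num_subgoals
            self.rng = np.random.default_rng(seed)

        def plan_subgoals(self, state: np.ndarray, goal: np.ndarray) -> np.ndarray:
            # A trained diffusion planner would denoise a coarse subgoal
            # trajectory; here we interpolate from state to goal and add
            # small noise to stand in for sampled plans.
            alphas = np.linspace(0.0, 1.0, self.num_subgoals + 1)[1:]
            subgoals = state + alphas[:, None] * (goal - state)
            return subgoals + 0.01 * self.rng.standard_normal(subgoals.shape)

    class QPerformerSketch:
        """Stand-in for a goal-conditioned Q-learning policy (hypothetical API)."""

        def act(self, state: np.ndarray, subgoal: np.ndarray) -> np.ndarray:
            # A trained Q-Performer would pick argmax_a Q(state, a | subgoal);
            # here we move greedily toward the subgoal as a placeholder.
            direction = subgoal - state
            norm = np.linalg.norm(direction)
            return direction / norm if norm > 1e-8 else np.zeros_like(direction)

    def rollout(state, goal, conductor, performer, steps_per_subgoal=10, step_size=0.1):
        """Orchestrate: the conductor plans sparse subgoals, the performer executes."""
        for subgoal in conductor.plan_subgoals(state, goal):
            for _ in range(steps_per_subgoal):
                action = performer.act(state, subgoal)
                state = state + step_size * action  # toy point-mass dynamics
                if np.linalg.norm(state - subgoal) < 0.05:
                    break  # subgoal reached; hand off to the next one
        return state

    if __name__ == "__main__":
        start, goal = np.zeros(2), np.array([1.0, 1.0])
        conductor = DConductorSketch(num_subgoals=4)
        final = rollout(start, goal, conductor, QPerformerSketch())
        print("final state:", final, "| distance to goal:", np.linalg.norm(final - goal))

The point of the sketch is the division of labor: sparse subgoals from the high-level planner keep the low-level policy's effective horizon short, which is how a design like PlanDQ eases credit assignment on sparse-reward, long-horizon tasks.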
Low Difficulty Summary (written by GrooveSquid.com; original content)
Offline reinforcement learning is a growing field that aims to train AI agents from previously collected data, without further interaction with the environment. This paper presents PlanDQ, a new algorithm for offline RL that can handle both short- and long-horizon tasks. The authors show that their approach outperforms prior methods on a variety of benchmarks.

Keywords

  • Artificial intelligence
  • Diffusion
  • Reinforcement learning