
Summary of PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer, by Chang Chen et al.


PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer

by Chang Chen, Junyeob Baek, Fei Deng, Kenji Kawaguchi, Caglar Gulcehre, Sungjin Ahn

First submitted to arXiv on: 10 Jun 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com; original content)
In this research paper, the authors aim to bridge the gap between offline reinforcement learning (RL) methods designed for short-horizon tasks and those that excel in long-horizon scenarios. They propose PlanDQ, a hierarchical planner that combines a diffusion-based high-level planner, D-Conductor, with a Q-learning-based low-level policy, Q-Performer. PlanDQ achieves superior performance across benchmark tasks, including the D4RL continuous-control benchmark as well as the long-horizon AntMaze, Kitchen, and Calvin tasks. The hierarchical design addresses the challenges of sparse-reward, long-horizon tasks by easing credit assignment and mitigating extrapolation errors.
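
To make the orchestration concrete, below is a minimal, self-contained Python sketch of the hierarchical idea described above. It is an illustration under loose assumptions, not the authors' implementation: the class names (DConductorSketch, QPerformerSketch), the interpolation-based subgoal "planner", and the greedy goal-seeking "policy" are hypothetical placeholders standing in for the paper's trained diffusion planner and Q-learning policy.

    # Hypothetical sketch of hierarchical plan orchestration:
    # a high-level "conductor" proposes a sparse sequence of subgoals,
    # and a low-level "performer" picks primitive actions toward the
    # current subgoal. Names and dynamics are illustrative only.

    import numpy as np

    class DConductorSketch:
        """Stand-in for a diffusion-based subgoal planner (hypothetical API)."""

        def __init__(self, num_subgoals: int, seed: int = 0):
            self.num_subgoals = num_subgoals
            self.rng = np.random.default_rng(seed)

        def plan_subgoals(self, state: np.ndarray, goal: np.ndarray) -> np.ndarray:
            # A trained diffusion planner would denoise a coarse subgoal
            # trajectory; here we interpolate from state to goal and add
            # small noise to stand in for sampled plans.
            alphas = np.linspace(0.0, 1.0, self.num_subgoals + 1)[1:]
            subgoals = state + alphas[:, None] * (goal - state)
            return subgoals + 0.01 * self.rng.standard_normal(subgoals.shape)

    class QPerformerSketch:
        """Stand-in for a goal-conditioned Q-learning policy (hypothetical API)."""

        def act(self, state: np.ndarray, subgoal: np.ndarray) -> np.ndarray:
            # A trained Q-Performer would pick argmax_a Q(state, a | subgoal);
            # here we move greedily toward the subgoal as a placeholder.
            direction = subgoal - state
            norm = np.linalg.norm(direction)
            return direction / norm if norm > 1e-8 else np.zeros_like(direction)

    def rollout(state, goal, conductor, performer, steps_per_subgoal=10, step_size=0.1):
        """Orchestrate: the conductor plans sparse subgoals, the performer executes."""
        for subgoal in conductor.plan_subgoals(state, goal):
            for _ in range(steps_per_subgoal):
                action = performer.act(state, subgoal)
                state = state + step_size * action  # toy point-mass dynamics
                if np.linalg.norm(state - subgoal) < 0.05:
                    break  # subgoal reached; hand off to the next one
        return state

    if __name__ == "__main__":
        start, goal = np.zeros(2), np.array([1.0, 1.0])
        conductor = DConductorSketch(num_subgoals=4)
        final = rollout(start, goal, conductor, QPerformerSketch())
        print("final state:", final, "| distance to goal:", np.linalg.norm(final - goal))

The point of the sketch is the division of labor: sparse subgoals from the high-level planner keep the low-level policy's effective horizon short, which is how a design like PlanDQ eases credit assignment on sparse-reward, long-horizon tasks.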
Low Difficulty Summary (written by GrooveSquid.com; original content)
Offline reinforcement learning is a growing field that aims to train AI agents from previously collected data, without further interaction with the environment. This paper presents PlanDQ, a new algorithm for offline RL that can handle both short- and long-horizon tasks. The authors show that their approach outperforms prior methods on a variety of benchmarks.

Keywords

  • Artificial intelligence
  • Diffusion
  • Reinforcement learning