
Summary of HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning, by Shengchao Hu et al.


HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

by Shengchao Hu, Ziqing Fan, Li Shen, Ya Zhang, Yanfeng Wang, Dacheng Tao

First submitted to arXiv on: 28 May 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (original content by GrooveSquid.com)
Offline multi-task reinforcement learning (MTRL) aims to develop a unified policy for diverse tasks without online interaction. Recent approaches use sequence modeling and the Transformer architecture’s scalability to leverage task similarities through parameter sharing. However, task content and complexity variations pose challenges in policy formulation, requiring judicious parameter sharing and gradient management for optimal performance. This work introduces Harmony Multi-Task Decision Transformer (HarmoDT), a novel solution that identifies an optimal harmony subspace of parameters for each task using bi-level optimization and meta-learning. The upper level learns a task-specific mask, while the inner level updates parameters to enhance the unified policy’s overall performance. Empirical evaluations on benchmarks demonstrate HarmoDT’s superiority, verifying its effectiveness.
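The bi-level scheme described above can be illustrated with a small sketch: an upper level maintains continuous scores per task that are hardened into a top-K parameter mask (the task's "harmony subspace"), while an inner level applies gradient updates only to the masked coordinates of the shared parameters. This is a toy illustration under loud assumptions: the quadratic per-task losses, the top-K masking rule, and the score-update heuristic are all invented for demonstration and are not the paper's actual algorithm or meta-learning procedure.

```python
import numpy as np

# Illustrative sketch of a HarmoDT-style bi-level loop on toy quadratic
# "tasks". The losses, top-K masking, and score update are assumptions
# made for this example, not the paper's implementation.

D = 8                                   # shared-parameter dimension
K = 4                                   # mask keeps a K-dim "harmony" subspace
task_targets = [np.full(D, c) for c in (1.0, 2.0, 3.0)]  # one toy task each

def task_loss(theta, target):
    """Toy per-task loss: squared distance to that task's ideal parameters."""
    return 0.5 * np.sum((theta - target) ** 2)

theta = np.zeros(D)                     # shared (unified) policy parameters
mask_scores = [np.zeros(D) for _ in task_targets]  # upper-level variables

for step in range(200):
    for t, target in enumerate(task_targets):
        # Upper level: harden the continuous scores into a top-K binary mask.
        mask = np.zeros(D)
        mask[np.argsort(mask_scores[t])[-K:]] = 1.0

        grad = theta - target           # gradient of this task's toy loss

        # Inner level: update only the masked subspace of the shared params.
        theta -= 0.1 * mask * grad

        # Upper-level heuristic: raise scores of dimensions whose gradient
        # is small for this task (they "harmonize" with its objective).
        mask_scores[t] += 0.1 * (np.abs(grad).mean() - np.abs(grad))

final_losses = [task_loss(theta, tgt) for tgt in task_targets]
```

In the paper, the upper level meta-learns each task's mask and the inner level updates Transformer weights; here both levels collapse to one-line gradient heuristics on a quadratic problem so the alternating structure is visible. Running the loop, tasks with conflicting targets drift toward disjoint masked subspaces, and the summed task loss drops well below its value at the zero initialization.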
Low Difficulty Summary (original content by GrooveSquid.com)
This paper is about teaching computers to do many things without needing to practice each one separately. The computer needs to learn from experience and make good choices based on what it knows. There are different ways to solve this problem, but the approach in this paper uses a special type of architecture called the Transformer. It helps the computer understand how different tasks are related and can use that knowledge to improve its performance. The goal is to develop a single policy that works well for many different tasks. This paper presents a new solution called HarmoDT that achieves this goal by identifying the most important parts of the computer’s learning process.

Keywords

» Artificial intelligence  » Mask  » Meta learning  » Multi task  » Optimization  » Reinforcement learning  » Transformer