Summary of Task-Aware Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning, by Ziqing Fan et al.
Task-Aware Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning
by Ziqing Fan, Shengchao Hu, Yuhang Zhou, Li Shen, Ya Zhang, Yanfeng Wang, Dacheng Tao
First submitted to arXiv on: 2 Nov 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | Offline multi-task reinforcement learning (MTRL) develops a unified policy for diverse tasks without online interaction, leveraging the Transformer architecture's scalability and parameter sharing to exploit task similarities. However, variations in task content and complexity make policy formulation challenging, demanding judicious parameter sharing and gradient management for optimal performance. The paper introduces the Harmony Multi-Task Decision Transformer (HarmoDT), which identifies an optimal harmony subspace of parameters for each task by solving a bi-level optimization problem within a meta-learning framework: the inner level updates parameters to improve overall policy performance, while the upper level learns masks that define each task's harmony subspace. To eliminate the need for task identifiers, a group-wise variant (G-HarmoDT) clusters tasks into coherent groups based on gradient information and uses a gating network to infer task identity during inference. Empirical evaluations across various benchmarks demonstrate the superiority of the approach, with gains of 8% in task-provided settings, 5% in task-agnostic settings, and 10% in unseen settings. |
Low | GrooveSquid.com (original content) | Offline reinforcement learning lets robots learn many tasks without interacting with the environment. The Harmony Multi-Task Decision Transformer is a new way to do this: it finds the best subset of parameters for each task by solving an optimization problem, so it works well even when the tasks are very different. A variant can also group similar tasks together and use a small network to figure out which group a task belongs to, so it does not need to be told the task. In tests, the approach is 8% better when the task is given, 5% better when it is not, and 10% better on tasks it has never seen. |
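The bi-level scheme described in the medium summary can be sketched on a toy problem. Everything below is illustrative, not HarmoDT's actual algorithm: quadratic losses stand in for the offline RL objective, sigmoid "mask scores" stand in for the paper's meta-learned masks, and the sign-of-agreement score update is a crude surrogate for the upper-level mask learning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multi-task setup: one shared parameter vector serves several tasks.
# Each task's loss is quadratic around a task-specific optimum (a stand-in
# for per-task RL losses); conflicting optima mimic gradient conflict.
n_params, n_tasks = 16, 3
targets = rng.normal(size=(n_tasks, n_params))

def task_loss(theta, t):
    # 0.5 * ||theta - target_t||^2
    return 0.5 * np.sum((theta - targets[t]) ** 2)

def task_grad(theta, t):
    return theta - targets[t]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

theta = np.zeros(n_params)                 # shared parameters
scores = np.zeros((n_tasks, n_params))     # per-task mask logits

initial = np.mean([task_loss(theta, t) for t in range(n_tasks)])

for step in range(300):
    masks = sigmoid(scores)                # soft "harmony subspace" masks
    grads = np.stack([task_grad(theta, t) for t in range(n_tasks)])

    # Inner level: update shared parameters with mask-weighted task gradients.
    theta -= 0.1 * np.mean(masks * grads, axis=0)

    # Upper level (surrogate): grow a parameter's mask for task t where its
    # gradient agrees with the pooled gradient, shrink it where they conflict.
    pooled = grads.mean(axis=0)
    scores += 0.5 * np.sign(grads * pooled)

final = np.mean([task_loss(theta, t) for t in range(n_tasks)])
print(f"mean task loss: {initial:.2f} -> {final:.2f}")
```

Masking out conflicting parameters per task is what lets the shared update make progress on every task at once; G-HarmoDT's extra step (not shown) would cluster tasks whose gradients agree and learn one mask per group instead of per task.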
Keywords
» Artificial intelligence » Inference » Meta learning » Multi task » Optimization » Reinforcement learning » Transformer