Summary of Solving Continual Offline Reinforcement Learning with Decision Transformer, by Kaixin Huang et al.
Solving Continual Offline Reinforcement Learning with Decision Transformer
by Kaixin Huang, Li Shen, Chen Zhao, Chun Yuan, Dacheng Tao
First submitted to arXiv on: 16 Jan 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary | 
|---|---|---|
| High | Paper authors | High Difficulty Summary: Read the original abstract on the paper's arXiv page | 
| Medium | GrooveSquid.com (original content) | Medium Difficulty Summary: This paper proposes new approaches to continual offline reinforcement learning (CORL), in which an agent learns a sequence of tasks from static datasets without forgetting earlier ones. Existing Actor-Critic-based methods suffer from distribution shift, low efficiency, and weak knowledge sharing. The Decision Transformer (DT) paradigm shows promise in addressing these issues, offering advantages in learning efficiency, distribution shift mitigation, and zero-shot generalization. However, DT exacerbates the forgetting problem during supervised parameter updates. To mitigate this, the authors introduce multi-head DT (MH-DT), which leverages distillation and selective rehearsal to enhance current-task learning, and low-rank adaptation DT (LoRA-DT). Experiments on the MuJoCo and Meta-World benchmarks show that these methods outperform state-of-the-art CORL baselines, with enhanced learning capabilities and better memory efficiency. | 
| Low | GrooveSquid.com (original content) | Low Difficulty Summary: This paper is about making machines learn new tasks from previously collected data without forgetting what they already know. The authors compare different methods to see which one works best. Existing methods are good at some things but bad at others. A newer method called Decision Transformer looks promising, but it forgets earlier tasks more easily. To fix this, the authors came up with two new ideas: multi-head DT and low-rank adaptation DT. These ideas help computers learn better and remember more. The authors tested them on different tasks and found that they work better than the older methods. | 
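
To make the "low-rank adaptation" idea in the summaries above more concrete, here is a minimal sketch of generic LoRA in plain Python. It is not the paper's LoRA-DT method, which the summaries do not describe in detail. All names (`make_lora_adapter`, `adapted_weight`), the matrix sizes, and the `alpha` value are assumptions chosen for illustration. The sketch keeps a weight matrix `W` frozen and adds a small trainable update `(alpha / rank) * B @ A`, so each new task only needs the two small factors.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def make_lora_adapter(d_out, d_in, rank, seed=0):
    """Create LoRA factors: A (rank x d_in) is small and random, B (d_out x rank) is zero."""
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    b = [[0.0] * rank for _ in range(d_out)]
    return a, b


def adapted_weight(w, a, b, alpha):
    """Return W + (alpha / rank) * B @ A without modifying the frozen W."""
    scale = alpha / len(a)  # len(a) is the rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def count_params(matrix):
    """Count the scalar entries of a matrix stored as a list of rows."""
    return sum(len(row) for row in matrix)


# Hypothetical sizes, for illustration only.
d_out, d_in, rank = 8, 8, 2
frozen_w = [[float(i == j) for j in range(d_in)] for i in range(d_out)]
lora_a, lora_b = make_lora_adapter(d_out, d_in, rank)

# Because B starts at zero, the adapted weight initially equals the frozen weight.
w_before = adapted_weight(frozen_w, lora_a, lora_b, alpha=4.0)

# Stand-in for a task-specific training step: only the small factor B changes.
lora_b[0][0] = 1.0
w_after = adapted_weight(frozen_w, lora_a, lora_b, alpha=4.0)

full_params = count_params(frozen_w)
adapter_params = count_params(lora_a) + count_params(lora_b)
```

In this toy setup the frozen matrix is never overwritten, so knowledge stored in it is preserved, while each task adds only `rank * (d_in + d_out)` adapter parameters. The saving grows as the matrix gets larger relative to the rank.
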
Keywords
* Artificial intelligence
* Distillation
* Generalization
* LoRA
* Low-rank adaptation
* Reinforcement learning
* Supervised
* Transformer
* Zero-shot




