Summary of Multi-State-Action Tokenisation in Decision Transformers for Multi-Discrete Action Spaces, by Perusha Moodley et al.
Multi-State-Action Tokenisation in Decision Transformers for Multi-Discrete Action Spaces
by Perusha Moodley, Pramod Kaushik, Dhillu Thambi, Mark Trovinger, Praveen Paruchuri, Xia Hong, Benjamin Rosman
First submitted to arXiv on: 1 Jul 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper, each written at a different level of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | Decision Transformers struggle in image-based environments with multi-discrete action spaces, even with enhanced architectures. Our proposed Multi-State Action Tokenisation (M-SAT) addresses this by tokenising actions at the individual-action level and incorporating auxiliary state information (a code sketch illustrating the idea follows this table). This disentangles actions, improving their interpretability and visibility within attention layers. We demonstrate M-SAT’s performance gains on challenging ViZDoom environments with multi-discrete action spaces, where it outperforms the baseline Decision Transformer without additional data or computational overhead. Surprisingly, removing positional encoding can even improve M-SAT’s performance in some cases. |
Low | GrooveSquid.com (original content) | Decision Transformers have trouble working with images and lots of different actions. We created a new method called Multi-State Action Tokenisation (M-SAT) that helps by breaking actions down into smaller parts and adding extra information about what is happening. This makes it easier to understand what the model is doing and why. Our tests show that M-SAT works better than regular Decision Transformers in tricky situations, and it doesn’t need extra data or powerful computers. |
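To make the tokenisation idea concrete, here is a minimal sketch of what per-component action tokenisation could look like in PyTorch. The class name, dimensions, and the [3, 3, 2] action layout are illustrative assumptions, not the paper's implementation; M-SAT also incorporates auxiliary state information, which is omitted here.

```python
# Minimal sketch (assumed names and dimensions, not the authors' code) of
# per-component action tokenisation for a multi-discrete action space.
import torch
import torch.nn as nn

class MultiDiscreteActionTokeniser(nn.Module):
    """Embed each component of a multi-discrete action as its own token,
    rather than collapsing the whole action vector into a single token."""

    def __init__(self, nvec, d_model):
        super().__init__()
        # One embedding table per action dimension, e.g. nvec = [3, 3, 2]
        # for a hypothetical ViZDoom-style action space.
        self.tables = nn.ModuleList([nn.Embedding(n, d_model) for n in nvec])

    def forward(self, actions):
        # actions: (batch, num_dims) integer tensor.
        # Returns (batch, num_dims, d_model): one token per sub-action,
        # ready to be interleaved into the Decision Transformer sequence.
        tokens = [table(actions[:, i]) for i, table in enumerate(self.tables)]
        return torch.stack(tokens, dim=1)

tokeniser = MultiDiscreteActionTokeniser(nvec=[3, 3, 2], d_model=128)
action = torch.tensor([[2, 0, 1]])   # one multi-discrete action
print(tokeniser(action).shape)       # torch.Size([1, 3, 128])
```

Each sub-action then appears as a separate token in the transformer's input sequence, which is what allows the attention layers to attend to individual actions, the property the summary describes as disentangling actions and improving their visibility.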
Keywords
* Artificial intelligence
* Attention
* Positional encoding
* Transformer