Summary of Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search, by Jonathan Light, Min Cai, Weiqin Chen, Guanzhi Wang, Xiusi Chen, Wei Cheng, Yisong Yue, and Ziniu Hu
Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search
by Jonathan Light, Min Cai, Weiqin Chen, Guanzhi Wang, Xiusi Chen, Wei Cheng, Yisong Yue, Ziniu Hu
First submitted to arXiv on: 20 Aug 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Computation and Language (cs.CL)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper's original abstract, available on its arXiv page |
Medium | GrooveSquid.com (original content) | The proposed method, STRATEGIST, uses Large Language Models (LLMs) to acquire new skills for playing multi-agent games through self-improvement. It gathers quality feedback from simulations with Monte Carlo tree search and LLM-based reflection, and uses that feedback to learn high-level strategic skills, such as how to evaluate game states, which in turn guide low-level execution. The method is demonstrated on both action planning and dialogue generation in game settings, where it outperforms traditional reinforcement learning and other LLM-based approaches on Game of Pure Strategy (GOPS) and The Resistance: Avalon. |
Low | GrooveSquid.com (original content) | STRATEGIST uses large language models to help computers play complex strategy games better. The computer learns by playing the game against itself and getting feedback on what it is doing well or poorly. This helps it develop high-level skills for making good moves and decisions. STRATEGIST is shown to work well on two types of tasks: planning actions and generating dialogue. It also outperforms other approaches to learning games such as Game of Pure Strategy and The Resistance: Avalon. |
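
To make the idea of "learning from simulation feedback" in the summaries above more concrete, here is a minimal Python sketch. It is not the authors' implementation: it contains no LLM and no Monte Carlo tree search. It only shows candidate strategies for a simplified Game of Pure Strategy being scored by simulated games, with the best one kept. The names `play_gops`, `random_policy`, `matching_policy`, `evaluate`, and `select_best` are hypothetical, the hand-written policies stand in for strategies that an LLM might propose, and discarding the prize on tied bids is a simplifying assumption.

```python
import random


def play_gops(policy_a, policy_b, n_cards=6, rng=None):
    """Play one simplified GOPS game and return score_a - score_b.

    Assumption: when both bids are equal, the prize is discarded.
    """
    rng = rng or random.Random()
    prizes = list(range(1, n_cards + 1))
    rng.shuffle(prizes)
    hand_a = set(range(1, n_cards + 1))
    hand_b = set(range(1, n_cards + 1))
    score_a = score_b = 0
    for prize in prizes:
        # Both players choose a bid without seeing the other's choice.
        bid_a = policy_a(prize, hand_a, rng)
        bid_b = policy_b(prize, hand_b, rng)
        hand_a.remove(bid_a)
        hand_b.remove(bid_b)
        if bid_a > bid_b:
            score_a += prize
        elif bid_b > bid_a:
            score_b += prize
    return score_a - score_b


def random_policy(prize, hand, rng):
    """Bid a uniformly random card from the hand."""
    return rng.choice(sorted(hand))


def matching_policy(prize, hand, rng):
    """Bid the card closest in value to the prize (lower card on ties)."""
    return min(hand, key=lambda card: (abs(card - prize), card))


def evaluate(policy, opponent, games=200, seed=0):
    """Average score margin of `policy` against `opponent` over simulated games."""
    rng = random.Random(seed)
    total = sum(play_gops(policy, opponent, rng=rng) for _ in range(games))
    return total / games


def select_best(candidates, opponent, games=200, seed=0):
    """Score every candidate strategy by simulation and return the best one."""
    scores = {name: evaluate(policy, opponent, games, seed)
              for name, policy in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    candidates = {"random": random_policy, "matching": matching_policy}
    best, scores = select_best(candidates, opponent=random_policy)
    print(best, scores)
```

In this toy setup the simulation results are the only feedback signal. According to the summaries above, STRATEGIST additionally uses tree search to gather feedback and LLM-based reflection to propose improved strategies, neither of which is modeled here.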
Keywords
* Artificial intelligence
* Reinforcement learning