Summary of Microscopic Analysis on LLM Players via Social Deduction Game, by Byungjun Kim et al.
Microscopic Analysis on LLM players via Social Deduction Game
by Byungjun Kim, Dayeon Seo, Bugeun Kim
First submitted to arXiv on: 19 Aug 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Computation and Language (cs.CL)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper and are written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper proposes an evaluation approach for large language model (LLM) players in social deduction games, focusing on SpyGame, a variant of SpyFall. Existing evaluation methods are criticized for relying too heavily on game-level outcomes and for lacking structured methodologies for error analysis. To address these limitations, the authors introduce eight quantitative metrics that assess intent identification and camouflage skills and that prove more effective than previous methods (a hypothetical sketch of such a per-utterance metric follows this table). A qualitative thematic analysis complements the quantitative findings by identifying categories of behavior that affect gameplay. |
Low | GrooveSquid.com (original content) | The paper looks at how well AI can play games like SpyFall. Right now, people evaluate these AI players by looking at what happens over the whole game, but that doesn’t tell us much about why the AI made certain moves. The researchers change this by introducing new ways to measure AI performance. They used a variant of SpyFall called SpyGame and tested four different AI models. To figure out how well each model did, they came up with eight specific metrics that show what the AI is good or bad at. This helps us see which parts of the game the AI excels at, like figuring out other players’ intentions or hiding its own role. |
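These summaries do not spell out the paper’s eight metrics, so the sketch below is only illustrative and not taken from the paper: it shows what a per-utterance intent-identification score could look like, in contrast to the game-level win/loss outcomes the authors argue are insufficient. The `Utterance` schema, the function name, and the toy game log are all invented for this illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Utterance:
    """One turn of table talk: who spoke and whom (if anyone) they accused."""
    speaker: str
    suspected_spy: Optional[str]

def intent_identification_accuracy(log, true_spy):
    """Fraction of non-spy accusations that point at the actual spy.

    A per-utterance score like this exposes *where* a model goes wrong,
    unlike a single game-level win/loss outcome.
    """
    accusations = [u for u in log
                   if u.speaker != true_spy and u.suspected_spy is not None]
    if not accusations:
        return 0.0
    hits = sum(u.suspected_spy == true_spy for u in accusations)
    return hits / len(accusations)

# Toy game log: dave is the spy; two of the three accusations are correct.
game_log = [
    Utterance("alice", suspected_spy="dave"),
    Utterance("bob", suspected_spy="dave"),
    Utterance("carol", suspected_spy="alice"),
    Utterance("dave", suspected_spy=None),  # the spy keeps quiet
]
print(intent_identification_accuracy(game_log, true_spy="dave"))  # ~0.667
```

A game-level evaluation would collapse this entire log into a single win/loss result; the per-utterance view is the kind of finer-grained signal the paper’s quantitative metrics aim to provide.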
Keywords
» Artificial intelligence » Large language model