
Microscopic Analysis on LLM players via Social Deduction Game

by Byungjun Kim, Dayeon Seo, Bugeun Kim

First submitted to arXiv on: 19 Aug 2024

Categories

  • Main: Artificial Intelligence (cs.AI)
  • Secondary: Computation and Language (cs.CL)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper, each written at a different level of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
This version is the paper’s original abstract, available on arXiv.
Medium Difficulty Summary (GrooveSquid.com original content)
The paper proposes an evaluation approach for large language model (LLM) players of social deduction games, focusing on SpyGame, a variant of the SpyFall game. Existing evaluation methods are criticized for relying too heavily on game-level outcomes and for lacking a structured methodology for error analysis. To address these limitations, the authors introduce eight quantitative metrics that assess intent-identification and camouflage skills and prove more informative than previous methods. A complementary qualitative thematic analysis identifies categories of behavior that affect gameplay, reinforcing the quantitative findings.
Low Difficulty Summary (GrooveSquid.com original content)
The paper looks at how AI can play games like SpyFall better. Right now, people evaluate these AI players by looking at what happens over a whole game, which doesn’t explain why the AI made certain moves. The researchers want to change this by introducing new ways to measure AI performance. They used a special version of SpyFall called SpyGame and tested four different AI models. To figure out how well each model did, they came up with eight specific metrics that reveal what each AI is good or bad at, such as figuring out other players’ intentions or hiding its own plans.

Keywords

  • Artificial intelligence
  • Large language model