
Summary of Do LLMs Find Human Answers to Fact-Driven Questions Perplexing? A Case Study on Reddit, by Parker Seegmiller et al.


Do LLMs Find Human Answers To Fact-Driven Questions Perplexing? A Case Study on Reddit

by Parker Seegmiller, Joseph Gatto, Omar Sharif, Madhusudan Basak, Sarah Masud Preum

First submitted to arXiv on: 1 Apr 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Large language models (LLMs) have been shown to excel at answering questions in online discourse. However, using LLMs to model human-like responses to fact-driven social media questions remains under-explored. This work investigates how well LLMs model the diverse human answers to fact-driven questions posted in topic-specific Reddit communities (subreddits). The authors collect and release a dataset of 409 fact-driven questions and 7,534 human-rated answers from 15 r/Ask{Topic} communities spanning 3 categories. They find that LLMs are more effective at modeling highly-rated human answers than poorly-rated ones.
Low Difficulty Summary (written by GrooveSquid.com, original content)
Large language models can answer questions online. But how well do they capture the different answers real people give to fact-based questions on social media? This research looks at how these models handle special communities on Reddit called subreddits. The authors collected a big dataset of 409 questions and 7,534 human-rated answers from 15 different subreddits about professions, social identities, and places. The results show that the models find highly-rated human answers much easier to predict than poorly-rated ones.
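The "perplexing" in the paper's title refers to language-model perplexity: how surprised a model is by a piece of text, computed from the probabilities it assigns to each token. As a minimal illustration (not the authors' implementation), the sketch below computes perplexity from hypothetical per-token log-probabilities for two made-up answers; a highly-rated answer that the model finds less surprising gets a lower score.

```python
import math

def perplexity(token_logprobs):
    """Perplexity of a sequence, given the natural-log probability a
    language model assigned to each of its tokens."""
    avg_nll = -sum(token_logprobs) / len(token_logprobs)  # average negative log-likelihood
    return math.exp(avg_nll)

# Hypothetical log-probabilities a model might assign to the tokens of
# a highly-rated vs. a poorly-rated human answer (illustrative values).
highly_rated = [-1.2, -0.8, -1.5, -0.9]
poorly_rated = [-3.1, -2.7, -3.4, -2.9]

print(perplexity(highly_rated))  # lower: the model finds this answer less surprising
print(perplexity(poorly_rated))  # higher: the model finds this answer more perplexing
```

In practice, the log-probabilities would come from scoring the real answer text with an LLM; the comparison of perplexities across answer quality is the kind of measurement the study's finding describes.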

Keywords

» Artificial intelligence  » Discourse