Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents
by Corby Rosset, Ho-Lam Chung, Guanghui Qin, Ethan C. Chau, Zhuo Feng, Ahmed Awadallah, Jennifer Neville, Nikhil Rao
First submitted to arxiv on: 27 Feb 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | In this paper, researchers address a critical gap in question-answering (QA) datasets by introducing Researchy Questions, a novel dataset of search-engine queries that are non-factoid, multi-perspective, and challenging for Large Language Models (LLMs). Existing QA benchmarks such as TriviaQA, NaturalQuestions, ELI5, and HotpotQA focus primarily on “known unknowns,” where the missing information is clearly indicated. This creates a false sense of security: good performance on these benchmarks does not necessarily translate to real-world scenarios. The new dataset targets the unmet need for questions with unclear information needs, also known as “unknown unknowns.” The authors demonstrate that users spend significant effort on these questions and that they remain challenging even for advanced LLMs such as GPT-4. Moreover, they show that slow-thinking answering techniques, such as decomposition into sub-questions, outperform direct answering. The researchers release approximately 100k Researchy Questions, along with the corresponding Clueweb22 URLs. |
| Low | GrooveSquid.com (original content) | This paper helps create better question-answering systems by making them more challenging and realistic. Most powerful language models can easily answer questions from existing datasets, but that is not how people search for information in real life. The authors created a new dataset of search-engine queries that are harder to answer and require more thought. They show that these questions are difficult even for the best language models, and that a slower, more thoughtful approach leads to better answers. |
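The “slow thinking” strategy the summaries mention can be pictured as a decompose-then-synthesize loop: break a multi-perspective question into sub-questions, answer each one, then combine the partial answers. The sketch below is illustrative only — the `llm()` stub, the prompts, and all function names are assumptions standing in for a real model call (e.g. GPT-4), not the paper’s actual implementation.

```python
# Hedged sketch of decompose-then-answer ("slow thinking") vs. direct answering.
# llm() is a placeholder that returns canned text so the sketch is runnable;
# in practice it would call a real model such as GPT-4.

def llm(prompt: str) -> str:
    """Placeholder for a real LLM call; returns canned text for this sketch."""
    if prompt.startswith("Decompose"):
        return ("1. What are the main perspectives on this issue?\n"
                "2. What evidence supports each perspective?\n"
                "3. How do experts weigh the trade-offs?")
    return f"(answer to: {prompt})"

def decompose(question: str) -> list[str]:
    """Ask the model to break a multi-perspective question into sub-questions."""
    raw = llm(f"Decompose into sub-questions: {question}")
    # Keep only numbered lines like "1. ..." and strip the number prefix.
    return [line.split(". ", 1)[1] for line in raw.splitlines() if ". " in line]

def answer_by_decomposition(question: str) -> str:
    """Answer each sub-question first, then synthesize a final answer."""
    subs = decompose(question)
    sub_answers = [llm(sub) for sub in subs]
    synthesis_prompt = (f"Synthesize an answer to '{question}' from: "
                        + "; ".join(sub_answers))
    return llm(synthesis_prompt)

if __name__ == "__main__":
    print(answer_by_decomposition("Is nuclear power a good climate solution?"))
```

With a real model behind `llm()`, each sub-question becomes a focused, searchable query, which is the intuition for why decomposition helps on non-factoid, multi-perspective questions.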
Keywords
» Artificial intelligence » GPT » Question answering