Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts?

by Hexiang Tan, Fei Sun, Wanli Yang, Yuanzhuo Wang, Qi Cao, Xueqi Cheng

First submitted to arXiv on: 22 Jan 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (the paper's abstract, written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (original content by GrooveSquid.com)
While Large Language Models (LLMs) have seen significant gains from auxiliary information, how they merge these contexts remains a crucial open question. In this paper, we develop a systematic framework to identify whether an LLM's response is attributed to the generated or the retrieved context. To facilitate this analysis, we construct datasets of conflicting contexts, in which each question is paired with both a generated and a retrieved context, only one of which contains the correct answer. Our experiments reveal a significant bias in several LLMs (GPT-4, GPT-3.5, and Llama2) towards generated contexts, even when they provide incorrect information. We attribute this bias to two key factors: i) generated contexts tend to be more similar to the question, which increases their likelihood of being selected; ii) the segmentation process applied to retrieved contexts disrupts their completeness, hindering their full utilization. Our findings deepen the understanding of how LLMs merge contexts, offer valuable insights for improving current augmentation methods, and highlight the risk that generated misinformation poses to retrieval-augmented LLMs.

Low Difficulty Summary (original content by GrooveSquid.com)
This paper looks at how Large Language Models (LLMs) use information from different sources to answer questions. The authors built special datasets in which each question comes with two pieces of supporting text: one written by a language model and one retrieved from a document collection, and only one of the two contains the correct answer. They tested several LLMs (GPT-4, GPT-3.5, and Llama2) on these datasets and found that the models tend to trust the model-written text even when it is wrong. This happens because the models favor information that looks similar to the question, even when it is not accurate. The research helps us understand how LLMs combine different sources of information and what can go wrong.
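
To make the setup above concrete, here is a minimal Python sketch of the kind of conflicting-context probe the paper describes: each question is shown alongside a generated and a retrieved context, and the model's answer is attributed to whichever context supports it. This is an illustrative sketch, not the authors' code; the names (ConflictExample, build_prompt, attribute_response) and the ask_llm hook are hypothetical placeholders for whatever model client you use.

from collections import Counter  # used in the tally example at the bottom
from dataclasses import dataclass

@dataclass
class ConflictExample:
    question: str
    generated_ctx: str        # context produced by an LLM (may carry the wrong answer)
    retrieved_ctx: str        # passage retrieved from a corpus (may carry the right answer)
    answer_in_generated: str  # answer supported by the generated context
    answer_in_retrieved: str  # answer supported by the retrieved context

def build_prompt(ex: ConflictExample) -> str:
    # Present both contexts side by side, then ask the question.
    return (
        "Answer the question using the contexts below.\n\n"
        f"Context A: {ex.generated_ctx}\n\n"
        f"Context B: {ex.retrieved_ctx}\n\n"
        f"Question: {ex.question}\nAnswer:"
    )

def attribute_response(ex: ConflictExample, response: str) -> str:
    # Credit the response to whichever context's answer string it contains.
    resp = response.lower()
    hits_generated = ex.answer_in_generated.lower() in resp
    hits_retrieved = ex.answer_in_retrieved.lower() in resp
    if hits_generated and not hits_retrieved:
        return "generated"
    if hits_retrieved and not hits_generated:
        return "retrieved"
    return "ambiguous"

def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call (e.g., GPT-4/3.5 or Llama2).
    raise NotImplementedError("plug in your own model client here")

# Usage: tally how often responses follow the generated vs. the retrieved context.
# counts = Counter(attribute_response(ex, ask_llm(build_prompt(ex))) for ex in dataset)

Aggregating the attribution labels over a dataset of such conflicting pairs gives the generated-versus-retrieved preference rate that the paper's experiments measure.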

Keywords

» Artificial intelligence  » GPT  » Likelihood