


Where is the answer? Investigating Positional Bias in Language Model Knowledge Extraction

by Kuniaki Saito, Kihyuk Sohn, Chen-Yu Lee, Yoshitaka Ushiku

First submitted to arXiv on: 16 Feb 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

Read the original abstract here
Medium Difficulty Summary (written by GrooveSquid.com, original content)

This paper explores how large language models (LLMs) can accurately answer questions about specific sentences in the documents they are trained on, yet struggle to extract information from other parts of those same documents. The authors propose that this phenomenon, known as the perplexity curse, is caused by auto-regressive training: because each token is predicted from the preceding tokens, the model has difficulty recalling information placed at diverse positions. To investigate this issue, the researchers created both synthetic and real datasets to evaluate the question-answering performance of LLMs with respect to the position of answers in documents. The study found that even large models suffer from the perplexity curse, but that regularization techniques such as a denoising auto-regressive loss can improve information extraction.
Low Difficulty Summary (written by GrooveSquid.com, original content)

Large language models need updates to stay current or adapt to new topics, usually by fine-tuning them on new documents. One key is remembering the latest information in a way that makes it easy to extract with a prompt sentence. However, LLMs have a problem called the perplexity curse: even when they are trained well, they struggle to find answers in the middle or at the end of a document. The study found that auto-regressive training causes this issue, because each token relies on all previous tokens, which makes it hard for the model to recall information from documents when given question prompts. The researchers created datasets and showed that even big models have this problem, but that a technique called denoising auto-regressive loss can help.

Keywords

» Artificial intelligence  » Fine tuning  » Perplexity  » Prompt  » Recall  » Regularization  » Token