Rectifying Demonstration Shortcut in In-Context Learning

by Joonwon Jang, Sanghwan Jang, Wonbin Kweon, Minjin Jeon, Hwanjo Yu

First submitted to arXiv on: 14 Mar 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

The high difficulty version is the paper's original abstract, available on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)

In this paper, the authors investigate how large language models (LLMs) learn tasks from a few demonstrations via in-context learning (ICL). They identify a phenomenon they call the “Demonstration Shortcut”: rather than learning the input-label relationships shown in the demonstrations, LLMs fall back on their pre-trained semantic priors when making predictions. To rectify this shortcut, the authors propose an In-Context Calibration method that enables LLMs to learn new input-label relationships from the demonstrations. The method is evaluated in two settings and yields substantial improvements across three LLM families (OPT, GPT, and Llama2) under various configurations.
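
The summary describes In-Context Calibration only at a high level. As a rough illustration of the family of techniques it belongs to, the sketch below estimates the model's prior over the label words by querying it with a content-free input placed after the same demonstrations, then divides that prior out of the test-time prediction. The sentiment task, the "N/A" probe input, the model choice, and all function names are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch: calibrating ICL label predictions against the model's
# prior over the label words. The probe input ("N/A") and all names are
# assumptions for illustration, not the authors' exact method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"  # OPT is one of the families the paper tests
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def label_probs(prompt: str, label_words: list[str]) -> torch.Tensor:
    """Probability the model assigns to each label word as the next token."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]          # next-token logits
    label_ids = [tok(" " + w, add_special_tokens=False).input_ids[0]
                 for w in label_words]             # first token of each label
    p = torch.softmax(logits, dim=-1)[label_ids]
    return p / p.sum()                             # renormalize over the labels

demos = ("Review: great movie!\nSentiment: positive\n"
         "Review: waste of time.\nSentiment: negative\n")
labels = ["positive", "negative"]

# Prior: what the model predicts for a content-free input given the same demos.
prior = label_probs(demos + "Review: N/A\nSentiment:", labels)

# Calibrated prediction: divide out the prior, then renormalize.
raw = label_probs(demos + "Review: a dull, lifeless film.\nSentiment:", labels)
calibrated = raw / prior
calibrated = calibrated / calibrated.sum()
print(dict(zip(labels, calibrated.tolist())))
```

Dividing by the prior pushes down labels the model already favored before seeing the test input, which is one simple way to counteract reliance on pre-trained semantic priors.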

Low Difficulty Summary (written by GrooveSquid.com, original content)

Large language models can solve many tasks after seeing just a few examples. They do this through something called “in-context learning.” But they often rely on what they already know about words and meanings instead of the specific examples they are given. This is like taking a shortcut when you are really supposed to be paying attention. The researchers in this paper help the models do better by giving them a way to focus on the important parts of each example. They tested their idea on three different families of language models and found that it made a big difference.

Keywords

» Artificial intelligence  » Attention  » GPT