
Summary of What Do Language Models Learn in Context? The Structured Task Hypothesis, by Jiaoda Li et al.


What Do Language Models Learn in Context? The Structured Task Hypothesis

by Jiaoda Li, Yifan Hou, Mrinmaya Sachan, Ryan Cotterell

First submitted to arXiv on: 6 Jun 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (GrooveSquid.com, original content)
Large language models (LLMs) exhibit an impressive ability to learn new tasks from demonstrations, a phenomenon known as in-context learning (ICL). Researchers have proposed three hypotheses to explain it: task selection, meta-learning, and composition of tasks. The paper empirically investigates these hypotheses through a suite of experiments derived from common text classification tasks. It presents evidence that invalidates the task selection and meta-learning hypotheses while supporting the task composition hypothesis. The results suggest that LLMs can learn novel tasks in context by combining tasks learned during pre-training.

Low Difficulty Summary (GrooveSquid.com, original content)
Imagine teaching a computer a new skill just by showing it a few examples, without retraining it! This is called "in-context learning" (ICL). Scientists have tried to figure out why language models are good at this. They proposed three ideas: the model picks a task it already knows, the model has learned a new way of learning, or the model combines old knowledge to learn something new. The study tested these ideas and found evidence supporting the third one. This means that models can learn new tasks by combining what they already know.
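For readers unfamiliar with the setup the summaries describe, in-context learning gives a model a handful of labeled demonstrations followed by an unlabeled query, all inside a single prompt, with no weight updates. A minimal sketch of how such a prompt is assembled for a text classification task (the sentiment examples below are invented for illustration and are not from the paper):

```python
# Build a few-shot ICL prompt for a toy sentiment-classification task.
# The demonstrations and query are illustrative, not taken from the paper.
demonstrations = [
    ("The movie was wonderful.", "positive"),
    ("I hated every minute.", "negative"),
    ("A delightful surprise.", "positive"),
]
query = "The plot was dull and predictable."

def build_icl_prompt(demos, query):
    """Concatenate (input, label) demonstrations, then the unlabeled query.
    A language model completing this prompt performs the task in context,
    without any parameter updates."""
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in demos]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_icl_prompt(demonstrations, query)
print(prompt)
```

The three hypotheses the paper examines are different explanations of what happens when the model completes the final `Sentiment:` line: retrieving a pre-trained task, running a learned learning algorithm, or composing pre-trained tasks.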

Keywords

» Artificial intelligence  » Meta learning  » Text classification