Summary of What Do Language Models Learn in Context? The Structured Task Hypothesis, by Jiaoda Li et al.
What Do Language Models Learn in Context? The Structured Task Hypothesis
by Jiaoda Li, Yifan Hou, Mrinmaya Sachan, Ryan Cotterell
First submitted to arXiv on: 6 Jun 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | Large language models (LLMs) exhibit impressive abilities to learn new tasks from demonstrations, known as in-context learning (ICL). Researchers have proposed three theories to explain this phenomenon: task selection, meta-learning, and composition of tasks. Our paper empirically investigates these hypotheses through a suite of experiments derived from common text classification tasks. We provide evidence that invalidates the task selection and meta-learning hypotheses, while supporting the composition of tasks hypothesis. Our results suggest that LLMs can learn novel tasks in context by combining tasks learned during pre-training. |
Low | GrooveSquid.com (original content) | Imagine teaching a computer a new skill just by showing it a few examples, without retraining it first! This is called “in-context learning” (ICL). Scientists have tried to figure out why computers are good at this. They proposed three ideas: the computer picks a task it already knows, the computer learns a new way of learning, or the computer combines old knowledge to learn something new. Our study tested these ideas and found evidence for the third one. This means that computers can learn new tasks by combining what they already know. |
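To make the in-context learning setup concrete, here is a minimal sketch of how demonstrations for a text classification task are assembled into a single prompt that a language model then completes. The prompt format, example texts, and labels are illustrative assumptions, not the paper's exact experimental setup, and no actual model is called.

```python
# Minimal sketch of an in-context learning (ICL) prompt for text
# classification. Labeled demonstrations are concatenated, followed by
# an unlabeled query; the model is expected to produce the final label.
# All texts and labels below are hypothetical examples.

def build_icl_prompt(demonstrations, query):
    """Join (text, label) demonstrations and append an unlabeled query."""
    blocks = [f"Input: {text}\nLabel: {label}" for text, label in demonstrations]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)

demos = [
    ("The movie was wonderful.", "positive"),
    ("I regret watching this.", "negative"),
]
prompt = build_icl_prompt(demos, "An unforgettable performance.")
print(prompt)
```

No model weights are updated here: the "learning" happens entirely through the demonstrations in the prompt, which is the phenomenon the paper's three hypotheses try to explain.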
Keywords
» Artificial intelligence » Meta learning » Text classification