
In-Context Learning with Representations: Contextual Generalization of Trained Transformers

by Tong Yang, Yu Huang, Yingbin Liang, Yuejie Chi

First submitted to arXiv on: 19 Aug 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL); Information Theory (cs.IT); Optimization and Control (math.OC); Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper and are written at different levels of difficulty: the medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper’s original abstract; read it on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com; original content)
Transformers have been shown to possess in-context learning capabilities: they can learn new tasks from a few examples provided during inference. However, the theoretical understanding of this phenomenon remains under-explored, particularly regarding whether transformers can generalize to unseen examples within a prompt. This paper investigates the training dynamics of transformers on non-linear regression tasks and shows that one-layer multi-head transformers converge at a linear rate to a global minimum when trained to predict unlabeled inputs from partially labeled prompts with Gaussian label noise. The analysis shows that the trained transformer effectively performs ridge regression over basis functions, learning enough contextual information to generalize to both unseen examples and unseen tasks.
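
To make that claim concrete, the Python/NumPy sketch below illustrates what "ridge regression over basis functions" computes: estimate the coefficients of the basis functions from the few noisy labeled pairs in a prompt, then predict labels for the unlabeled queries. This is a minimal illustration under assumed choices, not the paper's method or code; the basis functions, the ridge penalty lam, and the data are made up for the example.

import numpy as np

rng = np.random.default_rng(0)

def basis(x):
    # Hypothetical basis functions; the paper's template functions differ.
    return np.array([np.sin(x), np.cos(x), x])

def ridge_predict(x_labeled, y_labeled, x_query, lam=0.1):
    # Design matrix of basis features for the labeled prompt examples.
    Phi = np.stack([basis(x) for x in x_labeled])       # shape (n, 3)
    # Ridge estimate of the coefficients: (Phi^T Phi + lam*I)^{-1} Phi^T y
    theta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]),
                            Phi.T @ y_labeled)
    # Predict labels for the unlabeled queries in the same prompt.
    return np.array([basis(x) @ theta for x in x_query])

theta_true = np.array([1.0, -0.5, 0.3])                 # unknown task coefficients
x_labeled = rng.uniform(-2.0, 2.0, size=8)              # a few labeled examples
y_labeled = np.array([basis(x) @ theta_true for x in x_labeled])
y_labeled += 0.05 * rng.standard_normal(8)              # Gaussian label noise
x_query = np.array([0.0, 1.5])                          # unseen, unlabeled inputs
print(ridge_predict(x_labeled, y_labeled, x_query))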

Low Difficulty Summary (written by GrooveSquid.com; original content)
Transformers are super smart machines that can learn new things just by seeing a few examples! But have you ever wondered how they do it? This paper tries to figure out the secrets behind this magic. It shows that these transformers can actually learn from tiny bits of information, like a few sentences or words. And get this – they don’t even need the whole answer to be right! As long as they see some clues, they can make educated guesses and improve their skills. This study is important because it helps us understand how machines can learn and adapt in real-life situations.

Keywords

» Artificial intelligence  » Inference  » Linear regression  » Prompt  » Regression