
In-Context Learning with Representations: Contextual Generalization of Trained Transformers

by Tong Yang, Yu Huang, Yingbin Liang, Yuejie Chi

First submitted to arXiv on: 19 Aug 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL); Information Theory (cs.IT); Optimization and Control (math.OC); Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper and are written at different levels of difficulty: the medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper’s original abstract; read it on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com; original content)
Transformers have been shown to possess in-context learning capabilities: they can learn new tasks from a few examples provided during inference. However, the theoretical understanding of this phenomenon remains under-explored, particularly regarding whether transformers can generalize to unseen examples within a prompt. This paper investigates the training dynamics of transformers on non-linear regression tasks and shows that one-layer multi-head transformers converge at a linear rate to a global minimum when trained to predict unlabeled inputs from partially labeled prompts with Gaussian label noise. The analysis shows that the trained transformer effectively performs ridge regression over basis functions, learning enough contextual information to generalize to both unseen examples and unseen tasks.
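
To make that claim concrete, the Python/NumPy sketch below illustrates what "ridge regression over basis functions" computes: estimate the coefficients of the basis functions from the few noisy labeled pairs in a prompt, then predict labels for the unlabeled queries. This is a minimal illustration under assumed choices, not the paper's method or code; the basis functions, the ridge penalty lam, and the data are made up for the example.

import numpy as np

rng = np.random.default_rng(0)

def basis(x):
    # Hypothetical basis functions; the paper's template functions differ.
    return np.array([np.sin(x), np.cos(x), x])

def ridge_predict(x_labeled, y_labeled, x_query, lam=0.1):
    # Design matrix of basis features for the labeled prompt examples.
    Phi = np.stack([basis(x) for x in x_labeled])       # shape (n, 3)
    # Ridge estimate of the coefficients: (Phi^T Phi + lam*I)^{-1} Phi^T y
    theta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]),
                            Phi.T @ y_labeled)
    # Predict labels for the unlabeled queries in the same prompt.
    return np.array([basis(x) @ theta for x in x_query])

theta_true = np.array([1.0, -0.5, 0.3])                 # unknown task coefficients
x_labeled = rng.uniform(-2.0, 2.0, size=8)              # a few labeled examples
y_labeled = np.array([basis(x) @ theta_true for x in x_labeled])
y_labeled += 0.05 * rng.standard_normal(8)              # Gaussian label noise
x_query = np.array([0.0, 1.5])                          # unseen, unlabeled inputs
print(ridge_predict(x_labeled, y_labeled, x_query))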

Low Difficulty Summary (written by GrooveSquid.com; original content)
Transformers are super smart machines that can learn new things just by seeing a few examples! But have you ever wondered how they do it? This paper tries to figure out the secrets behind this magic. It shows that these transformers can actually learn from tiny bits of information, like a few sentences or words. And get this – they don’t even need the whole answer to be right! As long as they see some clues, they can make educated guesses and improve their skills. This study is important because it helps us understand how machines can learn and adapt in real-life situations.

Keywords

» Artificial intelligence  » Inference  » Linear regression  » Prompt  » Regression