Summary of Talking Heads: Understanding Inter-layer Communication in Transformer Language Models, by Jack Merullo et al.
Talking Heads: Understanding Inter-layer Communication in Transformer Language Models
by Jack Merullo, Carsten Eickhoff, Ellie Pavlick
First submitted to arXiv on: 13 Jun 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper and is written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The paper investigates how transformer language models (LMs) transmit information between layers. It identifies a mechanism, used in two LMs to selectively inhibit items in context, that underlies a common abstraction across many context-retrieval behaviors. The researchers find that models write features into low-rank subspaces of the residual stream, which are then read out by later layers, forming low-rank communication channels between layers. They also show that this mechanism explains an otherwise arbitrary-seeming sensitivity to the order of items in prompts. By decomposing attention heads with Singular Value Decomposition (SVD), they predict interactions between heads separated by one or more layers from their weight matrices alone (a rough sketch of this kind of weight-based analysis appears below the table). Manipulating internal representations and editing model weights improves accuracy on a synthetic task by over 20%. The analysis reveals an intricate, interpretable structure learned during language model pretraining, helping to explain why sophisticated LMs sometimes fail in simple domains. |
| Low | GrooveSquid.com (original content) | The paper looks at how language models work on the inside. It finds a way that some models suppress certain information in the input so that later parts of the model ignore it. This helps explain why models sometimes struggle with simple tasks like recalling items from a list. The researchers also show that they can make the models do better by changing what’s going on inside them. They think this will help us understand why language models sometimes don’t do well, even though they’re very good at lots of other things. |
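
To make the weight-matrix idea in the medium summary a little more concrete, below is a minimal toy sketch of how one might score a candidate low-rank communication channel between two attention heads using SVD of their weights alone. The function name `channel_score`, the column-vector convention, the rank cutoff `k`, and the random stand-in matrices are all illustrative assumptions; this is not the authors' code or their exact metric.

```python
# Toy sketch (assumptions noted in comments): score how strongly a later head's
# QK circuit "reads" from the low-rank subspace that an earlier head's OV
# circuit "writes" into the residual stream.
import numpy as np

def channel_score(W_OV_early: np.ndarray, W_QK_late: np.ndarray, k: int = 4) -> float:
    """Fraction of the later head's QK energy that lies in the earlier head's
    top-k write directions (illustrative metric, not the paper's)."""
    # Assuming the column-vector convention (output = W_OV @ x), the left
    # singular vectors of the OV matrix span the directions the earlier head
    # writes into the residual stream.
    U, S, Vt = np.linalg.svd(W_OV_early)
    U_k = U[:, :k]  # top-k "write" directions
    # Restrict the key-side input of the later head's QK bilinear form
    # (score = q^T W_QK k) to those write directions, then compare Frobenius
    # norms; a value near 1 means the later head mostly reads that subspace.
    projected = W_QK_late @ U_k
    return float(np.linalg.norm(projected) / np.linalg.norm(W_QK_late))

# Usage with random matrices standing in for real model weights.
d_model = 64
rng = np.random.default_rng(0)
W_OV_early = rng.normal(size=(d_model, d_model))
W_QK_late = rng.normal(size=(d_model, d_model))
print(f"channel score: {channel_score(W_OV_early, W_QK_late):.3f}")
```

With random weights the score hovers around the chance level of roughly sqrt(k / d_model); the intuition from the paper is that genuinely communicating head pairs in a trained model should score well above such a baseline.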
Keywords
» Artificial intelligence » Attention » Language model » Pretraining » Transformer