Localizing Paragraph Memorization in Language Models

by Niklas Stoehr, Mitchell Gordon, Chiyuan Zhang, Owen Lewis

First submitted to arxiv on: 28 Mar 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Cryptography and Security (cs.CR); Machine Learning (cs.LG); Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper, each written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

Read the original abstract on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper investigates how language models memorize and recite paragraphs from their training data. By contrasting the gradients of memorized and non-memorized examples, the researchers find that memorization is spread across multiple layers and model components, yet shows a distinguishable spatial pattern: gradients of memorized paragraphs are larger in lower model layers. The study also shows that fine-tuning only the high-gradient weights is enough to unlearn a memorized example. Finally, the authors localize a low-layer attention head involved in paragraph memorization that focuses predominantly on rare tokens, those least frequent in a corpus-level token distribution.
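For readers who want to see the core idea in code, here is a minimal PyTorch sketch of contrasting per-parameter gradients between a memorized and a non-memorized paragraph, then unlearning via sparse gradient ascent restricted to the high-gradient weights. The model choice (GPT-Neo 125M), the placeholder texts, the top-k budget, and the ascent loop are illustrative assumptions, not the paper's exact setup.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Illustrative model: any open causal LM works for this sketch.
    model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")

    # Placeholder texts; a real analysis would use paragraphs the model
    # does and does not reproduce verbatim from its training data.
    memorized_paragraph = "A paragraph the model can recite verbatim goes here."
    non_memorized_paragraph = "A paragraph the model cannot recite goes here."

    def param_gradients(text):
        """Per-parameter gradients of the LM loss for a single example."""
        model.zero_grad()
        inputs = tokenizer(text, return_tensors="pt")
        loss = model(**inputs, labels=inputs["input_ids"]).loss
        loss.backward()
        return {n: p.grad.detach().clone() for n, p in model.named_parameters()}

    grads_mem = param_gradients(memorized_paragraph)
    grads_non = param_gradients(non_memorized_paragraph)

    # Localize: rank weights by how differently they respond to the two texts.
    diff = {n: (grads_mem[n] - grads_non[n]).abs() for n in grads_mem}
    k = 10_000  # illustrative sparsity budget
    flat = torch.cat([d.flatten() for d in diff.values()])
    threshold = torch.topk(flat, k).values.min()
    masks = {n: (d >= threshold).float() for n, d in diff.items()}

    # Unlearn: gradient ascent on the memorized paragraph, updating only
    # the masked high-gradient weights (a generic sparse-unlearning step).
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
    for _ in range(10):
        model.zero_grad()
        inputs = tokenizer(memorized_paragraph, return_tensors="pt")
        loss = model(**inputs, labels=inputs["input_ids"]).loss
        (-loss).backward()  # push the memorized text's likelihood down
        for n, p in model.named_parameters():
            if p.grad is not None:
                p.grad.mul_(masks[n])  # zero out updates outside the mask
        optimizer.step()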
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper looks at how language models remember and repeat paragraphs from their training data. The researchers found that when a language model “remembers” something, it is not just one part of the model doing it; the memory is spread across multiple parts. They also discovered that some parts matter more for memorization than others: if you tweak only those important parts, you can make the model forget what it learned. Finally, they found a special “attention head” in an early layer of the model that helps with memorization and is very good at focusing on rare words.
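The rare-word behavior of that attention head can be probed with a simple check: does the attention a token receives correlate with how rare the token is in a corpus-level unigram count? The sketch below assumes the same illustrative model as above; the tiny stand-in corpus and the layer/head indices are hypothetical, since a real analysis would count tokens over the model's actual training corpus.

    import torch
    from collections import Counter
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")

    # Tiny stand-in corpus for the unigram statistics.
    corpus_texts = ["Some ordinary sentences.", "More everyday text for counting."]
    counts = Counter()
    for doc in corpus_texts:
        counts.update(tokenizer(doc)["input_ids"])

    paragraph = "A paragraph whose rare words the head might track goes here."
    inputs = tokenizer(paragraph, return_tensors="pt")
    with torch.no_grad():
        attentions = model(**inputs, output_attentions=True).attentions

    layer, head = 1, 2  # illustrative indices for a low-layer head
    # attentions[layer] has shape (batch, heads, query_pos, key_pos);
    # averaging over query positions gives the attention each token receives.
    received = attentions[layer][0, head].mean(dim=0)

    ids = inputs["input_ids"][0].tolist()
    rarity = torch.tensor([1.0 / (counts[i] + 1) for i in ids])

    # A head that tracks rare tokens should show attention rising with rarity.
    corr = torch.corrcoef(torch.stack([received, rarity]))[0, 1]
    print(f"attention-to-rarity correlation: {corr:.3f}")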

Keywords

  • Artificial intelligence
  • Attention
  • Fine-tuning