Summary of Unforgettable Generalization in Language Models, by Eric Zhang et al.
Unforgettable Generalization in Language Models
by Eric Zhang, Leshem Choshen, Jacob Andreas
First submitted to arXiv on: 3 Sep 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computation and Language (cs.CL)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This study investigates how language models (LMs) change their behavior when they are trained to “unlearn” a skill, specifically through fine-tuning on randomized labels. It examines transformer LMs in which tasks have been forgotten via this process and finds that predictions on examples outside the forgetting set vary dramatically across tasks. In some tasks, such as entailment classification, forgetting generalizes robustly and causes uninformative predictions on new task instances. In other tasks, like physical commonsense reasoning and scientific question answering, forgetting affects only the training examples, and models continue to perform accurately even on very similar instances. Dataset difficulty does not predict whether a behavior can be forgotten; instead, generalization of forgetting is weakly predicted by the model’s initial task confidence and the variability of its representations of the training data. Surprisingly, random-label forgetting is somewhat insensitive to the training set’s contents: models fine-tuned on science questions with random labels still answer other science questions accurately, yet begin to produce random labels on entailment classification. Finally, even generalizable forgetting is shallow: linear probes trained on the model’s representations can still perform the task reliably after forgetting. (A minimal code sketch of the random-label fine-tuning and linear-probe procedures appears after this table.) |
Low | GrooveSquid.com (original content) | Language models are like super smart computers that can learn and do lots of things! But sometimes we want them to “forget” something they know so they don’t use it anymore. This study looked at how language models change when they’re trained to forget a skill, specifically through fine-tuning on random labels. It turns out the models behave differently depending on which task they were told to forget. Sometimes they just make random guesses, but other times they still answer correctly even though they were supposed to forget! It’s like their brains are saying “oh yeah, I remember this!” or “hmm, I’m not sure about that one!” |
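The medium summary mentions two concrete procedures: fine-tuning on randomized labels to induce forgetting, and training a linear probe on the forgotten model’s representations to test how shallow the forgetting is. The sketch below illustrates both, assuming a Hugging Face Transformers setup; the model name, placeholder texts and labels, and hyperparameters are illustrative assumptions, not details from the paper, and this is not the authors’ code.

```python
# Minimal sketch (not the authors' code): random-label fine-tuning to
# "forget" a classification task, then a linear-probe check on the frozen
# model's hidden states. Model, data, and hyperparameters are placeholders.
import torch
from torch import nn
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "distilbert-base-uncased"  # assumption: any classifier backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

texts = ["A premise and hypothesis pair ...", "Another example ..."]  # placeholder data
true_labels = torch.tensor([0, 1])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Step 1: "forget" by fine-tuning on labels drawn uniformly at random,
# independently of the inputs.
random_labels = torch.randint(0, 2, (len(texts),))
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few passes over the forgetting set
    out = model(**batch, labels=random_labels)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Step 2: test whether the task is still linearly decodable from the
# frozen, "forgotten" model's representations.
model.eval()
with torch.no_grad():
    hidden = model(**batch, output_hidden_states=True).hidden_states[-1][:, 0]  # [CLS] vectors

probe = nn.Linear(hidden.size(-1), 2)
probe_opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for _ in range(100):  # fit the linear probe against the true labels
    logits = probe(hidden)
    loss = loss_fn(logits, true_labels)
    loss.backward()
    probe_opt.step()
    probe_opt.zero_grad()
```

In the paper’s framing, high probe accuracy after forgetting is the sign that forgetting is shallow; here the probe is fit on the same toy examples purely to show the mechanics of the setup.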
Keywords
» Artificial intelligence » Classification » Fine tuning » Generalization » Question answering » Transformer