Summary of What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement, by Xisen Jin et al.
What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement
by Xisen Jin, Xiang Ren
First submitted to arXiv on: 2 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computation and Language (cs.CL); Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper but are written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | This research paper addresses catastrophic forgetting in language models deployed in the wild: updating a model with corrected error instances causes it to make new errors on previously learned instances. To mitigate this, the authors develop forecasting models that predict which upstream pre-training examples will be forgotten by a given model update, training them on online learned examples paired with the upstream examples those updates caused to be forgotten. They propose two forecasters: a partially interpretable model based on pre-softmax logit scores and a black-box classifier based on inner products of example representations (see the sketch after this table). Experiments show that the black-box classifier outperforms the partially interpretable model across setups, including BART and T5 models. By replaying the examples forecast to be forgotten, the authors demonstrate the practical utility of their approach in reducing forgetting of upstream pre-training examples. |
| Low | GrooveSquid.com (original content) | This paper tackles a big problem with language models: when you update a model to fix its mistakes, it often forgets what it learned before! To solve this, scientists developed special forecasting models that predict which old examples will be forgotten because of the update. They tested these models on two types of language models, BART and T5, and found that one approach worked better than the other. By using these forecasts to replay old examples, they showed that it’s possible to reduce forgetting and make language models more reliable. |
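To make the black-box forecaster concrete, here is a minimal sketch of a forgetting classifier built on inner products of example representations. This is an illustration, not the authors' implementation: the `encode` helper, the logistic head, and the threshold are all assumptions, and a real version would use representations taken from the language model being updated (e.g., BART or T5 hidden states).

```python
# Minimal sketch of a forgetting forecaster based on inner products of
# example representations. All names here (encode, forecast_forgotten,
# the logistic weights) are illustrative assumptions, not the paper's code.
import numpy as np

def encode(example_id: int, dim: int = 16) -> np.ndarray:
    """Stand-in for a model-derived representation of an example.

    In practice this would come from the language model being updated;
    here we deterministically fake a vector per example id."""
    example_rng = np.random.default_rng(example_id)
    return example_rng.normal(size=dim)

def forecast_forgotten(online_example: int,
                       upstream_examples: list[int],
                       weight: float = 1.0,
                       bias: float = 0.0,
                       threshold: float = 0.5) -> list[int]:
    """Score each upstream example by the inner product of its
    representation with the online (error-correcting) example, then
    pass the score through a logistic head. Examples whose predicted
    forgetting probability exceeds the threshold are forecast to be
    forgotten and become replay candidates."""
    z = encode(online_example)
    forgotten = []
    for ex in upstream_examples:
        score = weight * float(z @ encode(ex)) + bias
        prob = 1.0 / (1.0 + np.exp(-score))  # logistic link
        if prob > threshold:
            forgotten.append(ex)
    return forgotten

# Replay sketch: before applying a model update for an online error,
# collect the upstream examples forecast to be forgotten and mix them
# into the update batch to reduce forgetting.
replay_set = forecast_forgotten(online_example=42,
                                upstream_examples=list(range(100)))
print(f"{len(replay_set)} upstream examples forecast to be forgotten")
```

In the paper's setup, the weights of such a classifier would be trained on pairs of online learned examples and the upstream pre-training examples their updates actually caused to be forgotten; the toy constants above simply stand in for that learned head.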
Keywords
* Artificial intelligence
* Softmax
* T5