What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement

by Xisen Jin, Xiang Ren

First submitted to arXiv on: 2 Feb 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL); Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper but are written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty summary is the paper’s original abstract, available on arXiv.

Medium Difficulty Summary (original content by GrooveSquid.com)
This research paper addresses catastrophic forgetting in language models that are refined after deployment. The authors observe that updating a model on corrected error instances causes it to forget previously learned ones: the updated model starts making errors on instances it used to handle correctly. To address this, they develop forecasting models that predict which upstream pre-training examples will be forgotten by a given model update. These forecasters are trained on online-learned examples paired with the upstream pre-training examples that the corresponding updates caused to be forgotten. Two forecasters are proposed: a partially interpretable model based on pre-softmax logit scores and a black-box classifier based on inner products of example representations. Experiments show that the black-box classifier achieves better accuracy across a range of setups with BART and T5 models. Finally, by replaying the examples forecasted to be forgotten during an update, the authors reduce forgetting of upstream pre-training examples, demonstrating the practical utility of the approach.
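
To make the forecasting idea concrete, here is a minimal PyTorch sketch of a black-box forecaster built on inner products of example representations. This is an illustration under assumptions, not the authors’ implementation: the class name ForgettingForecaster, the learned projection layers, and all dimensions are hypothetical; the only ingredient taken from the paper is that forgetting is predicted from inner products of example representations.

```python
import torch
import torch.nn as nn

class ForgettingForecaster(nn.Module):
    """Hypothetical black-box forecaster: score an (online example,
    upstream example) pair by the inner product of learned projections
    of their representations, then squash to a forgetting probability."""

    def __init__(self, hidden_dim: int, proj_dim: int = 128):
        super().__init__()
        self.proj_online = nn.Linear(hidden_dim, proj_dim)
        self.proj_upstream = nn.Linear(hidden_dim, proj_dim)

    def forward(self, online_repr: torch.Tensor, upstream_repr: torch.Tensor) -> torch.Tensor:
        # Inner product of the projected representations acts as the logit.
        score = (self.proj_online(online_repr) * self.proj_upstream(upstream_repr)).sum(dim=-1)
        return torch.sigmoid(score)

# The forecaster would be trained with binary cross-entropy on pairs labeled
# by whether the upstream example was actually forgotten after the update.
forecaster = ForgettingForecaster(hidden_dim=768)
online_repr = torch.randn(4, 768)    # representations of online-learned examples
upstream_repr = torch.randn(4, 768)  # representations of upstream examples
p_forget = forecaster(online_repr, upstream_repr)  # shape (4,), values in (0, 1)
```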

Low Difficulty Summary (original content by GrooveSquid.com)
This paper tackles a big problem with language models: when you update a model to fix its mistakes, it often forgets what it learned before! To solve this, scientists developed special forecasting models that predict which old examples will be forgotten because of the update. They tested these models on two types of language models, BART and T5, and found that one approach worked better than the other. By using these forecasts to replay old examples, they showed that it’s possible to reduce forgetting and make language models more reliable.
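
To illustrate how such forecasts could drive replay, here is a continuation of the hypothetical sketch above. The helper select_replay_examples and its signature are illustrative, not the paper’s API: it simply ranks upstream examples by predicted forgetting probability and returns the top-k to rehearse alongside the corrected error.

```python
import torch

def select_replay_examples(forecaster, online_repr, upstream_reprs, k=8):
    """Rank upstream pre-training examples by predicted probability of being
    forgotten after the update, and return the indices of the top-k so they
    can be replayed (trained on again) alongside the corrected error."""
    with torch.no_grad():
        # Broadcast the single online-learned example against every candidate.
        probs = forecaster(online_repr.expand_as(upstream_reprs), upstream_reprs)
    return torch.topk(probs, k=k).indices

# Usage, reusing the ForgettingForecaster sketched above:
forecaster = ForgettingForecaster(hidden_dim=768)
online_repr = torch.randn(768)           # the corrected (online-learned) example
upstream_reprs = torch.randn(1000, 768)  # pool of upstream pre-training examples
replay_ids = select_replay_examples(forecaster, online_repr, upstream_reprs, k=8)
```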

Keywords

* Artificial intelligence
* Softmax
* T5