
Summary of Revisiting Catastrophic Forgetting in Large Language Model Tuning, by Hongyu Li et al.


Revisiting Catastrophic Forgetting in Large Language Model Tuning

by Hongyu Li, Liang Ding, Meng Fang, Dacheng Tao

First submitted to arXiv on: 7 Jun 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper investigates the phenomenon of Catastrophic Forgetting (CF) in large language models (LLMs), where previously acquired knowledge is lost during fine-tuning. The authors reveal a direct link between the sharpness (i.e., lack of flatness) of the model's loss landscape and the extent of CF, and introduce sharpness-aware minimization to mitigate CF by flattening the loss landscape. Experiments on three datasets demonstrate the effectiveness of this method in alleviating CF. The results also show that the approach complements existing anti-forgetting strategies, further strengthening the resistance of LLMs to CF.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This research paper looks at a problem with big language models called Catastrophic Forgetting. It's when these models forget what they learned before once they're trained on new data. The authors found that this forgetting is tied to how sharp the model's "loss landscape" (the surface it learns on) becomes, and they came up with a way to reduce it by making the loss landscape flatter. They tested their idea on three different datasets and showed that it helps models remember what they learned before.

Keywords

» Artificial intelligence  » Fine tuning