Summary of “Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition”, by Jiyeon Kim et al.


Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition

by Jiyeon Kim, Hyunji Lee, Hyowon Cho, Joel Jang, Hyeonbin Hwang, Seungpil Won, Youbin Ahn, Dohaeng Lee, Minjoon Seo

First submitted to arXiv on: 2 Oct 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper investigates how a model’s ability to integrate its parametric knowledge evolves during pretraining, and how this affects knowledge acquisition and forgetting. The authors introduce knowledge entropy, a measure of how broadly the model draws on its memory sources, and find that it declines consistently as pretraining advances. This decline coincides with a weakened ability to acquire new knowledge and to retain existing knowledge, suggesting that low knowledge entropy impairs both. The authors also demonstrate that increasing the activity of inactive memory sources enhances the model’s capacity for knowledge acquisition and retention. (A rough, illustrative sketch of the knowledge-entropy measure appears after the summaries below.)
Low Difficulty Summary (written by GrooveSquid.com, original content)
The paper looks at how a computer program learns and remembers things as it is trained. It finds that as the program learns, it starts to rely on a few specific memories rather than using all of its memories equally. This makes it harder for the program to learn new things and to remember them later. The authors show that if they make the program use more of its memories again, it starts learning and remembering better.
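
The minimal sketch below illustrates the general idea behind a knowledge-entropy-style measure; it is not the paper’s exact formulation. It treats a model’s per-token weights over its memory sources as a probability distribution and computes the Shannon entropy of that distribution: high entropy means many memory sources are in use, low entropy means a few dominate. The function name, the toy vectors, and the use of natural-log entropy are illustrative assumptions.

    import numpy as np

    def knowledge_entropy(memory_coefficients: np.ndarray) -> float:
        """Shannon entropy (in nats) of how weight is spread across memory sources.

        `memory_coefficients` is a 1-D array of non-negative scores, one per
        memory source (e.g., one per feed-forward memory slot), for one token.
        """
        coeffs = np.abs(np.asarray(memory_coefficients, dtype=np.float64))
        probs = coeffs / (coeffs.sum() + 1e-12)   # normalize to a distribution
        probs = probs[probs > 0]                  # drop zeros (0 * log 0 -> 0)
        return float(-(probs * np.log(probs)).sum())

    # Toy illustration of the reported trend: early in pretraining the model
    # spreads weight over many memory sources (high entropy); later it
    # concentrates on a few (low entropy), which the authors link to weaker
    # acquisition and retention of new knowledge.
    early = np.ones(1024)                  # near-uniform usage of 1024 sources
    late = np.zeros(1024); late[:8] = 1.0  # only 8 sources remain active
    print(knowledge_entropy(early))        # ~6.93 (= ln 1024)
    print(knowledge_entropy(late))         # ~2.08 (= ln 8)

Averaging such a per-token quantity over tokens and layers would give a single model-level number whose trajectory could be tracked across pretraining checkpoints, which is roughly the kind of decaying trend the paper reports.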

Keywords

» Artificial intelligence  » Pretraining