Summary of “Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition”, by Jiyeon Kim et al.


Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition

by Jiyeon Kim, Hyunji Lee, Hyowon Cho, Joel Jang, Hyeonbin Hwang, Seungpil Won, Youbin Ahn, Dohaeng Lee, Minjoon Seo

First submitted to arXiv on: 2 Oct 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper investigates how a model’s ability to integrate its parametric knowledge evolves during pretraining, and how this affects knowledge acquisition and forgetting. The authors introduce knowledge entropy, a measure of how broadly the model draws on its memory sources, and find that it declines consistently as pretraining advances. This decline coincides with a weakened ability to acquire new knowledge and to retain existing knowledge, suggesting that low knowledge entropy impairs both. The authors also demonstrate that increasing the activity of inactive memory sources enhances the model’s capacity for knowledge acquisition and retention. (A rough, illustrative sketch of the knowledge-entropy measure appears after the summaries below.)
Low Difficulty Summary (written by GrooveSquid.com, original content)
The paper looks at how a computer program learns and remembers things as it is trained. It finds that as the program learns, it starts to rely on a few specific memories rather than using all of its memories equally. This makes it harder for the program to learn new things and to remember them later. The authors show that if they make the program use more of its memories again, it starts learning and remembering better.
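
The minimal sketch below illustrates the general idea behind a knowledge-entropy-style measure; it is not the paper’s exact formulation. It treats a model’s per-token weights over its memory sources as a probability distribution and computes the Shannon entropy of that distribution: high entropy means many memory sources are in use, low entropy means a few dominate. The function name, the toy vectors, and the use of natural-log entropy are illustrative assumptions.

    import numpy as np

    def knowledge_entropy(memory_coefficients: np.ndarray) -> float:
        """Shannon entropy (in nats) of how weight is spread across memory sources.

        `memory_coefficients` is a 1-D array of non-negative scores, one per
        memory source (e.g., one per feed-forward memory slot), for one token.
        """
        coeffs = np.abs(np.asarray(memory_coefficients, dtype=np.float64))
        probs = coeffs / (coeffs.sum() + 1e-12)   # normalize to a distribution
        probs = probs[probs > 0]                  # drop zeros (0 * log 0 -> 0)
        return float(-(probs * np.log(probs)).sum())

    # Toy illustration of the reported trend: early in pretraining the model
    # spreads weight over many memory sources (high entropy); later it
    # concentrates on a few (low entropy), which the authors link to weaker
    # acquisition and retention of new knowledge.
    early = np.ones(1024)                  # near-uniform usage of 1024 sources
    late = np.zeros(1024); late[:8] = 1.0  # only 8 sources remain active
    print(knowledge_entropy(early))        # ~6.93 (= ln 1024)
    print(knowledge_entropy(late))         # ~2.08 (= ln 8)

Averaging such a per-token quantity over tokens and layers would give a single model-level number whose trajectory could be tracked across pretraining checkpoints, which is roughly the kind of decaying trend the paper reports.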

Keywords

» Artificial intelligence  » Pretraining