
Memorizing Documents with Guidance in Large Language Models

by Bumjin Park, Jaesik Choi

First submitted to arXiv on: 23 Jun 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (GrooveSquid.com original content)
The proposed document-wise memory architecture tracks document memories during training by mapping document representations to memory entries. These memories are softly masked in the forward pass of large language models (LLMs), allowing more accurate tracking of document-related content. The architecture is combined with a novel document guidance loss, which increases the likelihood of text given its own document's memories and decreases the likelihood of that text given other documents' memories. Experimental results on Wikitext-103-v1 with Pythia-1B show improved recall of document-related content in generation with trained document-wise memories.
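To make the idea concrete, here is a minimal NumPy sketch of the two components the summary describes: a soft mask over memory entries computed from a document representation, and a contrastive-style guidance loss that rewards likelihood under the document's own memories while penalizing likelihood under other documents' memories. All function names, the dot-product-plus-sigmoid masking, and the margin form of the loss are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def soft_memory_mask(doc_repr, mem_keys):
    # Hypothetical mapping from a document representation to soft
    # weights over memory entries: dot-product scores -> sigmoid.
    scores = mem_keys @ doc_repr             # (num_memories,)
    return 1.0 / (1.0 + np.exp(-scores))     # soft mask, each entry in (0, 1)

def masked_forward(hidden, mem_values, mask):
    # Softly masked memories are added to the hidden state in the
    # forward pass instead of being selected hard (0/1).
    return hidden + (mask[:, None] * mem_values).sum(axis=0)

def guidance_loss(logp_own, logp_other, margin=1.0):
    # Increase the likelihood of text under the document's own
    # memories and push down its likelihood under other documents'
    # memories (hinge form is an assumption for illustration).
    return -logp_own + max(0.0, margin + logp_other)

# Tiny worked example with random document/memory vectors.
rng = np.random.default_rng(0)
doc = rng.normal(size=8)                 # document representation
keys = rng.normal(size=(4, 8))           # one key per memory entry
vals = rng.normal(size=(4, 16))          # memory values
h = np.zeros(16)                         # hidden state before memory

mask = soft_memory_mask(doc, keys)       # soft weights over 4 memories
out = masked_forward(h, vals, mask)      # memory-augmented hidden state
```

A hard 0/1 mask would make memory assignment non-differentiable; the soft mask lets gradients from the guidance loss flow back into the document-to-memory mapping during training.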
Low Difficulty Summary (GrooveSquid.com original content)
Large language models are super powerful tools, but they can only be as good as the data they’re trained on! Recently, researchers have found ways to identify specific parts of these models that relate to certain documents. Instead of trying to figure out what’s going on after training, this new approach tries to track memories from individual documents during the training process itself. This helps create more accurate language generation and allows for better recall of information related to specific documents.

Keywords

» Artificial intelligence  » Likelihood  » Loss function  » Recall  » Tracking