
Memorizing Documents with Guidance in Large Language Models

by Bumjin Park, Jaesik Choi

First submitted to arXiv on: 23 Jun 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (GrooveSquid.com original content)
The proposed document-wise memory architecture tracks document memories during training by mapping document representations to memory entries. These memories are softly masked in the forward pass of large language models (LLMs), allowing more accurate tracking of document-related content. The architecture is combined with a novel document guidance loss, which increases the likelihood of text given its own document's memories and decreases the likelihood of that text given other documents' memories. Experimental results on Wikitext-103-v1 with Pythia-1B show improved recall of document-related content in generation with trained document-wise memories.
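To make the idea concrete, here is a minimal NumPy sketch of the two components the summary describes: a soft mask over memory entries computed from a document representation, and a contrastive-style guidance loss that rewards likelihood under the document's own memories while penalizing likelihood under other documents' memories. All function names, the dot-product-plus-sigmoid masking, and the margin form of the loss are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def soft_memory_mask(doc_repr, mem_keys):
    # Hypothetical mapping from a document representation to soft
    # weights over memory entries: dot-product scores -> sigmoid.
    scores = mem_keys @ doc_repr             # (num_memories,)
    return 1.0 / (1.0 + np.exp(-scores))     # soft mask, each entry in (0, 1)

def masked_forward(hidden, mem_values, mask):
    # Softly masked memories are added to the hidden state in the
    # forward pass instead of being selected hard (0/1).
    return hidden + (mask[:, None] * mem_values).sum(axis=0)

def guidance_loss(logp_own, logp_other, margin=1.0):
    # Increase the likelihood of text under the document's own
    # memories and push down its likelihood under other documents'
    # memories (hinge form is an assumption for illustration).
    return -logp_own + max(0.0, margin + logp_other)

# Tiny worked example with random document/memory vectors.
rng = np.random.default_rng(0)
doc = rng.normal(size=8)                 # document representation
keys = rng.normal(size=(4, 8))           # one key per memory entry
vals = rng.normal(size=(4, 16))          # memory values
h = np.zeros(16)                         # hidden state before memory

mask = soft_memory_mask(doc, keys)       # soft weights over 4 memories
out = masked_forward(h, vals, mask)      # memory-augmented hidden state
```

A hard 0/1 mask would make memory assignment non-differentiable; the soft mask lets gradients from the guidance loss flow back into the document-to-memory mapping during training.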
Low Difficulty Summary (GrooveSquid.com original content)
Large language models are super powerful tools, but they can only be as good as the data they’re trained on! Recently, researchers have found ways to identify specific parts of these models that relate to certain documents. Instead of trying to figure out what’s going on after training, this new approach tries to track memories from individual documents during the training process itself. This helps create more accurate language generation and allows for better recall of information related to specific documents.

Keywords

» Artificial intelligence  » Likelihood  » Loss function  » Recall  » Tracking