
Summary of “Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models”, by Laura Ruis et al.


Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

by Laura Ruis, Maximilian Mozes, Juhan Bae, Siddhartha Rao Kamalakara, Dwarak Talupuru, Acyr Locatelli, Robert Kirk, Tim Rocktäschel, Edward Grefenstette, Max Bartolo

First submitted to arXiv on: 19 Nov 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (original content by GrooveSquid.com)
Large Language Models (LLMs) have been extensively studied for their capabilities and limitations. While LLMs can solve many problems, they also show surprising reasoning gaps compared to humans, casting doubt on the robustness of their generalisation strategies. Because the sheer volume of pretraining data makes traditional train-test separation impractical, we instead investigate the pretraining data that two models (7B and 35B parameters) rely on, analysing 2.5B of their pretraining tokens. We identify the documents that influence model outputs on three simple mathematical reasoning tasks and contrast them with the documents that are influential for answering factual questions. Our findings show that, while largely distinct sets of data influence each factual question, a document often has a similar influence across different reasoning questions within the same task, indicating the presence of procedural knowledge. Additionally, the answers to factual questions often appear among the most influential documents, whereas for reasoning questions they usually do not. When characterising the top-ranked documents for reasoning questions qualitatively, we confirm that the influential documents contain procedural knowledge, such as demonstrating how to obtain a solution using formulae or code. Our results suggest LLMs reason with a generalisable strategy that synthesises procedural knowledge from documents doing a similar form of reasoning.
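
The core comparison in the study can be illustrated with a small sketch. Assuming per-document influence scores have already been computed (for example with influence functions) for a set of reasoning queries and a set of factual queries, one can measure how consistently the same documents rank as influential across queries within each set. The arrays, sizes, and function names below are hypothetical placeholders for illustration, not the authors' data or implementation.

import numpy as np
from itertools import combinations
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
num_docs = 1000

# Hypothetical influence-score matrices: one row per pretraining document,
# one column per query. In the study these scores would come from influence
# functions over the pretraining corpus; here they are random placeholders.
reasoning_scores = rng.normal(size=(num_docs, 5))
factual_scores = rng.normal(size=(num_docs, 5))

def mean_pairwise_rank_correlation(scores):
    """Average Spearman correlation of document influence between query pairs."""
    pairs = combinations(range(scores.shape[1]), 2)
    corrs = [spearmanr(scores[:, i], scores[:, j])[0] for i, j in pairs]
    return float(np.mean(corrs))

# A higher correlation for reasoning queries than for factual queries would be
# consistent with the paper's finding: the same documents tend to help across
# questions within a reasoning task (procedural knowledge), while factual
# questions draw on largely distinct documents.
print("reasoning:", mean_pairwise_rank_correlation(reasoning_scores))
print("factual:  ", mean_pairwise_rank_correlation(factual_scores))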
Low Difficulty Summary (original content by GrooveSquid.com)
This paper is about how well Large Language Models (LLMs) can solve problems and make smart decisions. Recent studies have shown that while LLMs are good at solving some problems, they also make surprising mistakes compared to humans. To understand why this happens, researchers studied the pretraining data that two different LLMs rely on when producing their answers. They found that for simple math problems, the same documents often influence the model across many different questions, whereas each factual question tends to draw on its own distinct documents. This suggests that LLMs pick up a kind of “procedural knowledge” from their training data and reuse it when solving problems. Overall, this study helps us understand how LLMs work and what they’re good at.

Keywords

  • Artificial intelligence
  • Pretraining