Summary of "Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation," by Xinyi Wang et al.
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation
by Xinyi Wang, Alfonso Amayuelas, Kexun Zhang, Liangming Pan, Wenhu Chen, William Yang Wang
First submitted to arXiv on: 5 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract here
Medium | GrooveSquid.com (original content) | The paper investigates the emergence of complex reasoning capabilities in pre-trained language models (LMs) without explicit fine-tuning. The authors propose a perspective where LMs derive new conclusions by aggregating indirect reasoning paths seen during pre-training. This is demonstrated through two cases: logic reasoning with knowledge graphs and chain-of-thought reasoning. By formalizing the reasoning paths as random walk paths on knowledge/reasoning graphs, the authors analyze the learned LM distributions and suggest that a weighted sum of relevant path probabilities explains how LMs reason. Experiments on multiple datasets reveal the effect of training on random walk paths and show that augmenting training data with unlabeled random walk reasoning paths can improve real-world multi-step reasoning performance.
Low | GrooveSquid.com (original content) | The paper looks at how pre-trained language models can draw new conclusions without extra training. The authors think these models work by combining different "reasoning paths" they saw during their initial training. They test this idea on two kinds of reasoning: logic reasoning over knowledge graphs and chain-of-thought reasoning. By treating reasoning paths as random walks on a graph, the authors can analyze how the model reaches its answers. Their experiments show that training models on these reasoning paths makes them better at multi-step reasoning in real-world tasks.
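The core idea in the summaries above — treating reasoning as aggregating random-walk paths over a knowledge graph and scoring conclusions by a weighted sum of path probabilities — can be illustrated with a small sketch. This is not the paper's implementation; the toy triples, uniform path weights, and helper names below are illustrative assumptions only.

```python
import random
from collections import defaultdict

# Toy knowledge graph as (head, relation, tail) triples.
# These facts are made up purely for illustration.
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
    ("Berlin", "capital_of", "Germany"),
    ("Germany", "located_in", "Europe"),
]

# Adjacency list: head entity -> list of (relation, tail entity).
graph = defaultdict(list)
for h, r, t in triples:
    graph[h].append((r, t))

def sample_walks(start, length, n_walks, seed=0):
    """Sample random-walk paths of up to `length` hops from `start`."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_walks):
        node, path = start, [start]
        for _ in range(length):
            edges = graph.get(node)
            if not edges:  # dead end: stop the walk early
                break
            r, t = rng.choice(edges)
            path.extend([r, t])
            node = t
        paths.append(tuple(path))
    return paths

def aggregate_endpoint_probs(start, length, n_walks=100):
    """Estimate P(conclusion entity | start) by aggregating sampled
    paths -- here with uniform path weights, standing in for the
    weighted sum of path probabilities described in the paper."""
    probs = defaultdict(float)
    paths = sample_walks(start, length, n_walks)
    for p in paths:
        probs[p[-1]] += 1.0 / len(paths)
    return dict(probs)

probs = aggregate_endpoint_probs("Paris", length=2)
print(probs)  # every 2-hop walk from Paris ends at Europe
```

Under this toy graph, all two-hop walks from "Paris" pass through "France" and end at "Europe", so the aggregated probability mass concentrates on that single conclusion; in a denser graph, multiple indirect paths would contribute partial weight to several candidates.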
Keywords
* Artificial intelligence
* Fine-tuning