
Summary of Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models, by Yue Zhou et al.


Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models

by Yue Zhou, Yada Zhu, Diego Antognini, Yoon Kim, Yang Zhang

First submitted to arXiv on: 17 Apr 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | High Difficulty Summary: Read the original abstract here |
| Medium | GrooveSquid.com (original content) | Medium Difficulty Summary: The paper investigates how minor changes to the wording of a mathematical problem affect whether large language models can solve it. The results show that such subtle alterations can significantly shift both the answer distribution and the solve rate, which highlights the models’ lack of robustness and their sensitivity to surface form in complex problem-solving. To address this, the authors propose Self-Consistency-over-Paraphrases (SCoP), which generates diverse reasoning paths from specific surface forms (paraphrases) of the problem. The approach is evaluated on four mathematical reasoning benchmarks with three large language models and improves on vanilla self-consistency, particularly for problems that initially appeared unsolvable. |
| Low | GrooveSquid.com (original content) | Low Difficulty Summary: The paper looks at how small changes to the wording of a math problem affect whether AI models can solve it. The authors found that these tiny changes make a big difference in which answers the models give and how often they succeed. This shows that current AI models are not robust when solving complex math problems, because they are overly sensitive to the specific way a problem is worded. To address this, the authors propose SCoP (Self-Consistency-over-Paraphrases), which has a model work from several rewordings of the same problem so that it explores more varied solution paths and gets better at tricky math questions. |
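
To make the idea in the summaries concrete, here is a minimal sketch of voting over paraphrases next to vanilla self-consistency. It is an illustration under stated assumptions, not the authors’ implementation: the function names, the `paraphrase` and `solve` callables, the default sample counts, and the toy stand-ins for a language model are all hypothetical, and the summaries do not say how SCoP aggregates answers, so plain majority voting is assumed here.

```python
from collections import Counter
from typing import Callable, List


def vanilla_self_consistency(
    problem: str,
    solve: Callable[[str], str],  # hypothetical: one sampled final answer
    num_samples: int = 8,
) -> str:
    """Sample several answers for the ORIGINAL wording and majority-vote."""
    answers = [solve(problem) for _ in range(num_samples)]
    return Counter(answers).most_common(1)[0][0]


def scop_answer(
    problem: str,
    paraphrase: Callable[[str, int], List[str]],  # hypothetical: k rewordings
    solve: Callable[[str], str],                  # hypothetical: one sampled answer
    num_paraphrases: int = 4,
    samples_per_form: int = 2,
) -> str:
    """Sample answers across several surface forms, then majority-vote.

    Assumption: the vote pools every answer from every paraphrase.
    """
    surface_forms = paraphrase(problem, num_paraphrases)
    answers = [
        solve(form)
        for form in surface_forms
        for _ in range(samples_per_form)
    ]
    # Ties are resolved in favour of the answer that was seen first.
    return Counter(answers).most_common(1)[0][0]


# Toy stand-ins for a language model, only to make the sketch runnable.
def toy_paraphrase(problem: str, k: int) -> List[str]:
    return [f"{problem} (v{i})" for i in range(k)]


def toy_solve(form: str) -> str:
    # Pretend the model fails on the original wording and on one rewording,
    # but succeeds (answer "12") on the other rewordings.
    if "(v" not in form or form.endswith("(v0)"):
        return "7"
    return "12"


if __name__ == "__main__":
    question = "How many apples are left?"
    print("vanilla:", vanilla_self_consistency(question, toy_solve))
    print("SCoP-style:", scop_answer(question, toy_paraphrase, toy_solve))
```

In this toy setup the original wording always yields the wrong answer, so repeated sampling of it cannot recover, while voting across rewordings can. That mirrors the summaries’ point about problems that initially appear unsolvable, though real models are stochastic and the gain is not guaranteed.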

Keywords

» Artificial intelligence