Summary of DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving, by Yuxuan Tong et al.
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
by Yuxuan Tong, Xiwen Zhang, Rui Wang, Ruidong Wu, Junxian He
First submitted to arXiv on: 18 Jun 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | In this paper, researchers address the limitations of large language models in solving mathematical problems by proposing a novel approach called Difficulty-Aware Rejection Tuning (DART). They argue that previous works have focused too much on easy queries, producing datasets biased toward simple problems. To mitigate this, DART allocates more trials to difficult queries during the synthesis phase, yielding more training examples for challenging problems. The authors create new mathematical problem-solving datasets that emphasize difficult queries yet are smaller than previous ones. They fine-tune various base models on these datasets and achieve strong results on 6 mathematical benchmarks, outperforming prior state-of-the-art methods despite using smaller datasets and no proprietary models. |
Low | GrooveSquid.com (original content) | Mathematical problems require advanced reasoning abilities, but large language models struggle to solve them. Researchers usually use data from proprietary models to create new datasets, then fine-tune their models for top-tier results. However, these datasets are biased toward easy queries and often lack correct responses for challenging ones. To fix this, the authors propose a method that gives difficult queries more sampling attempts during data synthesis, making it easier for models to learn complex reasoning. They create new datasets that focus on hard problems and train different models on them. The results show that their approach outperforms previous methods on 6 mathematical benchmarks. |
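The data-synthesis strategy the medium summary describes, allocating more sampling trials to harder queries and keeping only the responses that pass a correctness check, can be sketched roughly as below. This is a minimal illustration, not the paper's actual pipeline: the `allocate_trials` scaling rule, the budget parameters, and the `sampler`/`checker` callables are all assumptions introduced here for clarity.

```python
def allocate_trials(fail_rate: float, base: int = 4, cap: int = 64) -> int:
    """Give harder queries (higher observed failure rate) a larger sampling
    budget: scale the base budget by 1 / (1 - fail_rate), clipped to
    [base, cap]. The exact scaling rule here is an illustrative assumption."""
    scale = 1.0 / max(1.0 - fail_rate, 1.0 / cap)
    return min(cap, max(base, round(base * scale)))


def synthesize(queries, fail_rates, sampler, checker):
    """Difficulty-aware rejection sampling: for each query, draw a number of
    candidate solutions proportional to its difficulty, and keep only the
    (query, solution) pairs that the checker accepts."""
    kept = []
    for q in queries:
        for _ in range(allocate_trials(fail_rates[q])):
            solution = sampler(q)          # e.g. a call to an open-weight LLM
            if checker(q, solution):       # e.g. exact match on final answer
                kept.append((q, solution))
    return kept
```

In this sketch an easy query (failure rate 0) gets the base budget of 4 trials, while a query the model fails 99% of the time gets the full cap of 64, mirroring the summary's point that hard queries need many more attempts before any correct response appears.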