Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
by DiJia Su, Sainbayar Sukhbaatar, Michael Rabbat, Yuandong Tian, Qinqing Zheng
First submitted to arXiv on: 13 Oct 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Machine Learning (cs.LG); Logic in Computer Science (cs.LO)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper introduces Dualformer, a Transformer model that integrates both fast and slow reasoning modes. Unlike prior models that rely solely on System 2 thinking, which incurs high computational cost and slow response times, Dualformer combines the two modes to strengthen its reasoning capabilities. The model is trained on data with randomized reasoning traces, where parts of each trace are dropped using strategies tailored to the trace structure. At inference time, Dualformer can be configured to output only the solution (fast mode), both the reasoning chain and the solution (slow mode), or to decide automatically which mode to engage (auto mode); a code sketch of these ideas follows the table. Experiments show that Dualformer outperforms baseline models in both performance and computational efficiency, optimally solving 97.6% of 30×30 maze navigation tasks in slow mode and 80% in fast mode. When the same training recipe is used to fine-tune large language models, it also improves performance on math problems. |
Low | GrooveSquid.com (original content) | This paper creates a new AI model called Dualformer that can think fast or slow, just like humans do. Usually, AI models are either super fast but not very smart, or really smart but slow to work out an answer. Dualformer is different because it combines both types of thinking in one model. It’s trained on special data that teaches it when it can skip certain steps while solving problems. This lets it solve some tasks much faster and better than other models. In tests, Dualformer solved mazes and math problems really well, even beating other smart AI models. |
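
To make the training and inference ideas above concrete, here is a minimal Python sketch of randomized trace dropping and mode control via marker tokens. Everything in it is illustrative rather than the paper’s exact scheme: the token names (`<trace>`, `<solution>`), the clause structure of the trace, and the drop probabilities are all assumptions made for exposition.

```python
import random

# Assumed marker tokens; the paper's actual vocabulary may differ.
BOS, TRACE_START, SOLUTION_START = "<bos>", "<trace>", "<solution>"

def randomize_trace(trace_clauses, rng, p_full=0.25, p_none=0.25, p_drop=0.5):
    """Randomly drop parts of a structured reasoning trace.

    Each training example keeps the whole trace, drops it entirely, or
    drops individual clauses at random, so the model learns both 'slow'
    (trace + solution) and 'fast' (solution-only) behavior. Probabilities
    here are arbitrary placeholders.
    """
    r = rng.random()
    if r < p_full:
        return list(trace_clauses)          # slow-mode-style example
    if r < p_full + p_none:
        return []                           # fast-mode-style example
    return [c for c in trace_clauses if rng.random() > p_drop]

def build_training_sequence(prompt, trace_clauses, solution, rng):
    """Assemble one training sequence with a (possibly dropped) trace."""
    kept = randomize_trace(trace_clauses, rng)
    trace_tokens = [tok for clause in kept for tok in clause]
    # Marker tokens delimit the trace and the solution, so inference
    # can later steer between modes by forcing these markers.
    return [BOS, *prompt, TRACE_START, *trace_tokens, SOLUTION_START, *solution]

def mode_prefix(prompt, mode):
    """Build the inference prompt for each mode (token names assumed).

    fast: force the solution marker so no trace is generated.
    slow: force the trace marker so the model reasons first.
    auto: stop after the prompt and let the model choose.
    """
    prefix = [BOS, *prompt]
    if mode == "fast":
        prefix += [TRACE_START, SOLUTION_START]
    elif mode == "slow":
        prefix += [TRACE_START]
    return prefix

rng = random.Random(0)
print(build_training_sequence(
    prompt=["maze", "start", "goal"],
    trace_clauses=[["create", "a"], ["close", "a"], ["create", "b"]],
    solution=["plan", "a", "b"],
    rng=rng,
))
print(mode_prefix(["maze", "start", "goal"], mode="fast"))
```

The design point mirrored here is that the same marker tokens seen during training double as inference-time controls: forcing the solution marker yields fast mode, forcing the trace marker yields slow mode, and stopping after the prompt leaves the choice to the model (auto mode).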
Keywords
» Artificial intelligence » Fine tuning » Inference » Transformer