Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
by DiJia Su, Sainbayar Sukhbaatar, Michael Rabbat, Yuandong Tian, Qinqing Zheng
First submitted to arXiv on: 13 Oct 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Machine Learning (cs.LG); Logic in Computer Science (cs.LO)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper introduces Dualformer, a Transformer model that integrates both fast and slow reasoning modes. Unlike prior models that rely solely on System 2 thinking, which incurs high computational cost and slow response times, Dualformer combines the two modes to strengthen its reasoning capabilities. The model is trained on data with randomized reasoning traces, where parts of each trace are dropped using strategies tailored to the trace structure. At inference time, Dualformer can be configured to output only the solution (fast mode), both the reasoning chain and the solution (slow mode), or to decide automatically which mode to engage (auto mode); a code sketch of these ideas follows the table. Experiments show that Dualformer outperforms baseline models in both performance and computational efficiency, optimally solving 97.6% of 30×30 maze navigation tasks in slow mode and 80% in fast mode. When the same training recipe is used to fine-tune large language models, it also improves performance on math problems. |
Low | GrooveSquid.com (original content) | This paper creates a new AI model called Dualformer that can think fast or slow, just like humans do. Usually, AI models are either super fast but not very smart, or really smart but slow to work out an answer. Dualformer is different because it combines both types of thinking in one model. It’s trained on special data that teaches it when it can skip certain steps while solving problems. This lets it solve some tasks much faster and better than other models. In tests, Dualformer solved mazes and math problems really well, even beating other smart AI models. |
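
To make the training and inference ideas above concrete, here is a minimal Python sketch of randomized trace dropping and mode control via marker tokens. Everything in it is illustrative rather than the paper’s exact scheme: the token names (`<trace>`, `<solution>`), the clause structure of the trace, and the drop probabilities are all assumptions made for exposition.

```python
import random

# Assumed marker tokens; the paper's actual vocabulary may differ.
BOS, TRACE_START, SOLUTION_START = "<bos>", "<trace>", "<solution>"

def randomize_trace(trace_clauses, rng, p_full=0.25, p_none=0.25, p_drop=0.5):
    """Randomly drop parts of a structured reasoning trace.

    Each training example keeps the whole trace, drops it entirely, or
    drops individual clauses at random, so the model learns both 'slow'
    (trace + solution) and 'fast' (solution-only) behavior. Probabilities
    here are arbitrary placeholders.
    """
    r = rng.random()
    if r < p_full:
        return list(trace_clauses)          # slow-mode-style example
    if r < p_full + p_none:
        return []                           # fast-mode-style example
    return [c for c in trace_clauses if rng.random() > p_drop]

def build_training_sequence(prompt, trace_clauses, solution, rng):
    """Assemble one training sequence with a (possibly dropped) trace."""
    kept = randomize_trace(trace_clauses, rng)
    trace_tokens = [tok for clause in kept for tok in clause]
    # Marker tokens delimit the trace and the solution, so inference
    # can later steer between modes by forcing these markers.
    return [BOS, *prompt, TRACE_START, *trace_tokens, SOLUTION_START, *solution]

def mode_prefix(prompt, mode):
    """Build the inference prompt for each mode (token names assumed).

    fast: force the solution marker so no trace is generated.
    slow: force the trace marker so the model reasons first.
    auto: stop after the prompt and let the model choose.
    """
    prefix = [BOS, *prompt]
    if mode == "fast":
        prefix += [TRACE_START, SOLUTION_START]
    elif mode == "slow":
        prefix += [TRACE_START]
    return prefix

rng = random.Random(0)
print(build_training_sequence(
    prompt=["maze", "start", "goal"],
    trace_clauses=[["create", "a"], ["close", "a"], ["create", "b"]],
    solution=["plan", "a", "b"],
    rng=rng,
))
print(mode_prefix(["maze", "start", "goal"], mode="fast"))
```

The design point mirrored here is that the same marker tokens seen during training double as inference-time controls: forcing the solution marker yields fast mode, forcing the trace marker yields slow mode, and stopping after the prompt leaves the choice to the model (auto mode).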
Keywords
» Artificial intelligence » Fine tuning » Inference » Transformer