
Summary of Beyond Imitation: Learning Key Reasoning Steps from Dual Chain-of-Thoughts in Reasoning Distillation, by Chengwei Dai et al.


Beyond Imitation: Learning Key Reasoning Steps from Dual Chain-of-Thoughts in Reasoning Distillation

by Chengwei Dai, Kun Li, Wei Zhou, Songlin Hu

First submitted to arxiv on: 30 May 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper presents a novel approach to distilling the reasoning capabilities of Large Language Models (LLMs) into smaller, more compact models. The authors observe that Chain-of-Thought (CoT) reasoning in LLMs consists mostly of simple steps plus a small set of key steps that significantly impact the conclusion. Previous distillation methods supervised fine-tune student models on correct CoT data produced by teacher LLMs, which leaves students struggling to learn those key reasoning steps. To address this issue, the authors propose EDIT (mistakE-Driven key reasonIng step distillaTion), a method that helps student models learn the key reasoning steps rather than relying on simple fine-tuning alone. The approach generates dual CoT data with similar reasoning paths but divergent conclusions, then applies the minimum edit distance algorithm to locate the key steps and optimizes their likelihood. Extensive experiments validate the effectiveness of EDIT across both in-domain and out-of-domain benchmark reasoning datasets.
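To make the "locate key steps" idea concrete, here is a minimal, hypothetical sketch, not the authors' implementation: given two chains-of-thought with similar reasoning paths but divergent conclusions, align them step by step and collect the steps where they differ. Python's `difflib.SequenceMatcher` is used here as a stand-in for the paper's minimum edit distance alignment; the example chains and the function name `key_steps` are illustrative assumptions.

```python
# Sketch: find the divergent ("key") steps between a correct and an
# incorrect chain-of-thought via sequence alignment (difflib stands in
# for a minimum-edit-distance alignment; not the authors' code).
import difflib

def key_steps(correct_cot, wrong_cot):
    """Return the steps unique to each chain under the alignment."""
    sm = difflib.SequenceMatcher(a=correct_cot, b=wrong_cot, autojunk=False)
    correct_only, wrong_only = [], []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag != "equal":  # 'replace', 'delete', or 'insert' spans
            correct_only.extend(correct_cot[i1:i2])
            wrong_only.extend(wrong_cot[j1:j2])
    return correct_only, wrong_only

# Illustrative dual CoTs: same path until the final two steps diverge.
correct = ["Let x be the cost.", "2x + 3 = 11", "x = 4", "Answer: 4"]
wrong   = ["Let x be the cost.", "2x + 3 = 11", "x = 7", "Answer: 7"]

good, bad = key_steps(correct, wrong)
print(good)  # steps whose likelihood the student should increase
print(bad)   # steps whose likelihood the student should decrease
```

In EDIT, the likelihood of the steps found this way is then optimized during distillation, so the student focuses on the steps that actually flip the conclusion rather than imitating the whole trace uniformly.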
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about making big language models smaller and more efficient without losing their smart thinking abilities. The authors found that these models mostly use simple steps when they reason, but a few key steps are really important for getting the right answer. Previously, people taught smaller models by giving them examples of correct thinking, but this didn't work well because the models just copied what they saw instead of learning how to think correctly. To fix this problem, the authors created a new way to teach smaller models called EDIT. It compares correct and mistaken reasoning that reach different conclusions, so the model can spot the mistakes and learn from them. This helps the model become better at making decisions and thinking critically.

Keywords

» Artificial intelligence  » Distillation  » Fine tuning  » Likelihood  » Supervised