


Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation

by Phuc Phan, Hieu Tran, Long Phan

First submitted to arXiv on: 21 Feb 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com's goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper's original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (the paper's original abstract, written by the paper authors)

Read the original abstract here.

Medium Difficulty Summary (original content by GrooveSquid.com)

We propose Distillation Contrastive Decoding (DCD), an approach to enhance the reasoning capabilities of Large Language Models (LLMs) during inference. Unlike previous methods that relied on a smaller "amateur" model or on analyzing hidden-state differences, DCD combines Contrastive Chain-of-Thought Prompting with distillation techniques such as dropout and quantization. This addresses the main limitation of Contrastive Decoding (CD), which typically requires both an expert and an amateur model and therefore increases computational demands. By integrating contrastive prompts with distillation, DCD eliminates the need for a separate amateur model and reduces memory usage. Our evaluations show that DCD significantly enhances LLM performance across reasoning benchmarks, outperforming CD and existing methods on the GSM8K and StrategyQA datasets. (A minimal code sketch of the decoding step appears after these summaries.)
Low Difficulty Summary (original content by GrooveSquid.com)

We want to make Large Language Models (LLMs) better at reasoning. To do this, we created a new method called Distillation Contrastive Decoding (DCD). It's different from earlier approaches, which needed a second, smaller model to compare against. Instead, DCD uses techniques like dropout and quantization to create a "weaker" view of the same model, then contrasts the two to pick better answers. This makes it more efficient and effective than previous methods. We tested DCD on various reasoning tasks and found that it significantly improves LLM performance, beating existing methods.
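
To make the idea concrete, here is a minimal, illustrative Python sketch of one decoding step. It assumes a Hugging Face-style causal language model; the function name, the dropout-as-amateur trick, the contrastive scoring rule, and the hyperparameter values (alpha for the plausibility cutoff, beta for the contrast strength) are assumptions drawn from the summary above and from standard contrastive decoding, not the authors' exact implementation.

```python
import math
import torch
import torch.nn.functional as F

def dcd_step(model, expert_ids, amateur_ids, alpha=0.1, beta=0.5):
    """One illustrative step of Distillation Contrastive Decoding.

    The same model plays both roles: the "expert" pass runs in eval mode
    on a valid chain-of-thought prompt (expert_ids), while the "amateur"
    pass runs with dropout re-enabled on a contrastive prompt
    (amateur_ids), standing in for the separate small model that classic
    contrastive decoding would require.
    """
    # Expert pass: deterministic forward in eval mode.
    model.eval()
    with torch.no_grad():
        expert_logits = model(expert_ids).logits[:, -1, :]

    # Amateur pass: dropout active, degrading the same weights into a
    # weaker "distilled" predictor (quantization would be an alternative).
    model.train()
    with torch.no_grad():
        amateur_logits = model(amateur_ids).logits[:, -1, :]
    model.eval()

    expert_logp = F.log_softmax(expert_logits, dim=-1)
    amateur_logp = F.log_softmax(amateur_logits, dim=-1)

    # Plausibility constraint from contrastive decoding: only tokens the
    # expert itself rates as reasonably likely stay eligible.
    cutoff = math.log(alpha) + expert_logp.max(dim=-1, keepdim=True).values

    # Reward tokens the expert prefers and the amateur does not.
    scores = (1 + beta) * expert_logp - beta * amateur_logp
    return scores.masked_fill(expert_logp < cutoff, float("-inf"))
```

In this sketch, greedily taking `scores.argmax(dim=-1)`, appending the chosen token to both prompts, and repeating would produce a full generation. Because the amateur is just the expert with dropout enabled, no second model ever needs to be loaded, which is where the memory savings described above come from.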

Keywords

  • Artificial intelligence
  • Distillation
  • Dropout
  • Inference
  • Prompting
  • Quantization