Parameter Efficient Diverse Paraphrase Generation Using Sequence-Level Knowledge Distillation

by Lasal Jayawardena, Prasan Yapa

First submitted to arXiv on: 19 Apr 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (the paper's original abstract, written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (original content written by GrooveSquid.com)
The paper explores the application of Large Language Models (LLMs) to domain-specific Natural Language Generation (NLG) tasks, specifically paraphrasing. While LLMs perform well across many domains, their use in commercial settings is hindered by high computational costs and long inference times. To address this, the authors develop three distinct student models using sequence-level knowledge distillation; these models achieve faster inference while maintaining quality comparable to the original LLM-generated paraphrases. The distilled models also exhibit both syntactic and lexical diversity, a property not typically observed in neural-based approaches. In human evaluation, the distilled models show only a 4% drop in performance relative to the LLM teacher used for distillation, despite being 1000 times smaller.
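
To make the distillation recipe concrete, the sketch below shows the general shape of sequence-level knowledge distillation: the teacher LLM first generates a paraphrase for each source sentence, and a much smaller sequence-to-sequence student is then fine-tuned on those (source, paraphrase) pairs. The `teacher_paraphrase` helper, the `t5-small` student, and all hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch of sequence-level knowledge distillation for paraphrasing,
# assuming a Hugging Face seq2seq student (e.g. t5-small); the paper's actual
# models, prompts, and hyperparameters are not reproduced here.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)


def build_distillation_corpus(sources, teacher_paraphrase):
    # Step 1: the teacher LLM labels each source sentence with a paraphrase.
    # `teacher_paraphrase` is a hypothetical callable wrapping the teacher model.
    return Dataset.from_dict({
        "source": sources,
        "target": [teacher_paraphrase(s) for s in sources],
    })


def train_student(corpus, student_name="t5-small"):
    # Step 2: fine-tune a much smaller student on the teacher's outputs
    # (sequence-level distillation: the student imitates whole output sequences).
    tokenizer = AutoTokenizer.from_pretrained(student_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(student_name)

    def tokenize(batch):
        enc = tokenizer(batch["source"], truncation=True, max_length=128)
        labels = tokenizer(text_target=batch["target"], truncation=True, max_length=128)
        enc["labels"] = labels["input_ids"]
        return enc

    tokenized = corpus.map(tokenize, batched=True, remove_columns=corpus.column_names)
    trainer = Seq2SeqTrainer(
        model=model,
        args=Seq2SeqTrainingArguments(output_dir="student-paraphraser",
                                      num_train_epochs=3,
                                      per_device_train_batch_size=16),
        train_dataset=tokenized,
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()
    return model, tokenizer
```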
Low Difficulty Summary (original content written by GrooveSquid.com)
The paper is about using powerful language models to make writing tasks easier and faster. Right now, these models are not very practical because they need a lot of computing power and take a long time to produce results. The researchers found a way to make the models much smaller and faster while still keeping their good qualities. They did this by teaching small models to learn from a large one. This new approach helps generate many different versions of a text that are similar but not exactly the same, which is important for tasks like rewriting texts in different styles.
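
As a rough illustration of how a distilled student could produce "many different versions" of a sentence, the sketch below samples several paraphrases from a fine-tuned seq2seq model. The sampling settings are illustrative assumptions, not values from the paper.

```python
# Minimal sketch: sample several diverse paraphrases from a distilled student
# (e.g. the model returned by train_student above); settings are illustrative.
def paraphrase(model, tokenizer, sentence, n=5):
    inputs = tokenizer(sentence, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,            # sampling adds lexical/syntactic variety
        top_p=0.95,
        num_return_sequences=n,
        max_new_tokens=64,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
```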

Keywords

* Artificial intelligence  * Distillation  * Inference  * Knowledge distillation  * Teacher model