Summary of InverseCoder: Self-Improving Instruction-Tuned Code LLMs with Inverse-Instruct, by Yutong Wu et al.


InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct

by Yutong Wu, Di Huang, Wenxuan Shi, Wei Wang, Lingzhe Gao, Shihao Liu, Ziyuan Nan, Kaizhao Yuan, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Yewen Pu, Dawei Yin, Xing Hu, Yunji Chen

First submitted to arXiv on: 8 Jul 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Software Engineering (cs.SE)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract; read it via the arXiv link above.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Recent advancements in open-source code large language models (LLMs) have been driven by fine-tuning on data generated from powerful closed-source LLMs, which are expensive to obtain. This paper investigates whether a fine-tuned open-source model can be used to generate additional data for its own instruction-tuning dataset. The authors propose Inverse-Instruct, a data augmentation technique that leverages a fine-tuned LLM to generate additional instructions and code responses from its own training dataset. By adding these pairs to the original dataset, a stronger code LLM can be obtained through fine-tuning. Empirical validation on open-source code models (e.g., CodeLlama-Python and DeepSeek-Coder) and benchmarks (e.g., HumanEval(+), MBPP(+), DS-1000, and MultiPL-E) shows that Inverse-Instruct consistently improves the base models.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper explores a way to improve open-source code language models. Currently, these models are trained on data from more powerful closed-source models that are expensive to access. The authors ask whether an open-source model can generate new data to improve itself. They propose a technique called Inverse-Instruct, which uses the model to create new instructions and code responses based on its own training data. By adding these new pairs to the original dataset, they can fine-tune the model again and make it even stronger. The authors test this idea on several models and benchmarks and show that it works well.
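The augmentation loop described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `generate` function here is a hypothetical stand-in for a call to the fine-tuned code LLM (the paper also covers details such as how candidate instructions are produced and selected, which this sketch omits).

```python
# Sketch of the Inverse-Instruct data-augmentation loop.
# A seed dataset of (instruction, code) pairs is expanded by asking the
# model to "invert" each code response back into a new instruction, and
# the new pairs are appended for a further round of fine-tuning.

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a call to the fine-tuned code LLM.

    A real implementation would query the model; here we return a
    fixed-format string so the pipeline logic can run end to end.
    """
    return "Write code that satisfies the following snippet's behavior."

def inverse_instruct(dataset: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Augment a dataset of (instruction, code) pairs.

    For each code response, the model is prompted to summarize the code
    into a fresh instruction, yielding an additional (instruction, code)
    pair that is appended to the original data.
    """
    augmented = list(dataset)  # keep the original pairs
    for _instruction, code in dataset:
        new_instruction = generate(
            f"Summarize this code as a natural-language instruction:\n{code}"
        )
        augmented.append((new_instruction, code))
    return augmented

# Tiny seed dataset: one instruction-tuning pair.
seed = [("Reverse a string.", "def rev(s):\n    return s[::-1]")]
bigger = inverse_instruct(seed)
print(len(bigger))  # one original pair plus one synthesized pair
```

The resulting `bigger` dataset would then be used to fine-tune the base model again, which is where the "self-improving" gain in the paper comes from.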

Keywords

» Artificial intelligence  » Data augmentation  » Fine tuning  » Instruction tuning