Summary of InverseCoder: Self-Improving Instruction-Tuned Code LLMs with Inverse-Instruct, by Yutong Wu et al.
InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct
by Yutong Wu, Di Huang, Wenxuan Shi, Wei Wang, Lingzhe Gao, Shihao Liu, Ziyuan Nan, Kaizhao Yuan, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Yewen Pu, Dawei Yin, Xing Hu, Yunji Chen
First submitted to arXiv on: 8 Jul 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary Recent advancements in open-source code large language models (LLMs) have been driven by fine-tuning on data generated from powerful closed-source LLMs, which are expensive to obtain. This paper investigates whether a fine-tuned open-source model can be used to generate additional data for its instruction-tuning dataset. The authors propose Inverse-Instruct, a data augmentation technique that leverages a fine-tuned LLM to generate additional instructions and code responses from its own training dataset. By adding these pairs to the original dataset, a stronger code LLM can be obtained through fine-tuning. Empirical validation on open-source code models (e.g., CodeLlama-Python and DeepSeek-Coder) and benchmarks (e.g., HumanEval(+), MBPP(+), DS-1000, and MultiPL-E) shows that Inverse-Instruct consistently improves the base models. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper explores a way to improve open-source code language models. Currently, these models are trained using data from more powerful closed-source models that are expensive to access. The authors ask if we can use these open-source models to generate new data and make them better. They propose a technique called Inverse-Instruct, which uses the model to create new instructions and code responses based on its own training data. By adding these new pairs to the original dataset, they can fine-tune the model again and make it even stronger. The authors test this idea with several models and benchmarks and show that it works well. |
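To make the idea concrete, here is a minimal sketch of the augmentation loop the summaries describe: the model reads each code response from its own training data, writes a new instruction for it, and the new (instruction, code) pairs are merged with the originals before re-fine-tuning. The `generate` stub below stands in for a call to the fine-tuned code LLM, and the paper's full pipeline also involves filtering and selecting the generated instructions, which this sketch omits.

```python
def generate(prompt):
    # Placeholder for the fine-tuned code LLM; a real run would call the model
    # to summarize the code snippet into a natural-language instruction.
    return "Summarized instruction for: " + prompt.splitlines()[0]

def inverse_instruct(dataset):
    """Given (instruction, code) pairs, ask the model to write a fresh
    instruction for each code response, then merge the synthetic pairs
    with the originals to form an augmented fine-tuning set."""
    augmented = list(dataset)
    for _, code in dataset:
        new_instruction = generate(code)           # code -> instruction
        augmented.append((new_instruction, code))  # reuse the original code as the response
    return augmented

seed = [("Write a function that adds two numbers.",
         "def add(a, b):\n    return a + b")]
augmented = inverse_instruct(seed)
print(len(augmented))  # original pair plus one synthetic pair -> 2
```

The stronger model is then obtained by fine-tuning the base model on `augmented` instead of the original `seed` set; the summaries above report that this consistently helps across benchmarks.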
Keywords
» Artificial intelligence » Data augmentation » Fine tuning » Instruction tuning