Summary of Don’t Half-listen: Capturing Key-part Information in Continual Instruction Tuning, by Yongquan He and Xuancheng Huang and Minghao Tang and Lingxun Meng and Xiang Li and Wei Lin and Wenyuan Zhang and Yifu Gao
Don’t Half-listen: Capturing Key-part Information in Continual Instruction Tuning
by Yongquan He, Xuancheng Huang, Minghao Tang, Lingxun Meng, Xiang Li, Wei Lin, Wenyuan Zhang, Yifu Gao
First submitted to arXiv on: 15 Mar 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract, available on arXiv. |
| Medium | GrooveSquid.com (original content) | The paper proposes a novel continual instruction tuning (CIT) method for large language models (LLMs) that addresses catastrophic forgetting (CF) by dynamically replaying data and refining the training objective so that the model captures task-aware information. The method computes information gain on masked key parts of instructions, which alleviates overfitting to the general descriptions in instructions. The authors also introduce two metrics, P-score and V-score, to measure the generalization and instruction-following abilities of LLMs. Experiments show that the method outperforms other approaches on both seen and held-out tasks. An illustrative code sketch of the masked-part idea follows this table. |
| Low | GrooveSquid.com (original content) | Large language models can be trained to produce results consistent with human goals on specific tasks, but this process can cause previously learned abilities to degrade. To address this, the authors propose a new way to continually train these models: information gain decides which data to replay and refines the training objective, helping the model capture the important details in each instruction. Two new metrics measure how well the model generalizes and follows instructions. The results show that this approach works better than other methods on both familiar and unfamiliar tasks. |
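To make the masked-part idea above more concrete, here is a minimal, hypothetical sketch in PyTorch/Hugging Face style. It compares the response loss with and without a key part of the instruction, treating the difference as an information-gain signal for selecting replay data. The function names (`response_loss`, `key_part_information_gain`, `select_for_replay`) and the threshold are illustrative assumptions, not the authors' implementation; the paper's actual computation and replay strategy may differ.

```python
# Illustrative sketch only (not the paper's implementation): estimate how much
# a masked "key part" of an instruction contributes to predicting the response,
# and use that signal to pick examples to replay during continual instruction
# tuning. Assumes a Hugging Face causal LM and tokenizer are passed in.
import torch

def response_loss(model, tokenizer, instruction, response):
    """Average cross-entropy of the response tokens given the instruction."""
    prompt_ids = tokenizer(instruction, return_tensors="pt").input_ids
    full_ids = tokenizer(instruction + response, return_tensors="pt").input_ids
    labels = full_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # score only the response tokens
    with torch.no_grad():
        out = model(input_ids=full_ids, labels=labels)
    return out.loss.item()

def key_part_information_gain(model, tokenizer, instruction, key_part,
                              response, mask_token="..."):
    """Loss increase when the key part of the instruction is masked out.

    A large gain suggests the model relies on that key part; a small gain
    suggests it is ignoring ("half-listening to") the instruction details.
    """
    masked_instruction = instruction.replace(key_part, mask_token)
    return (response_loss(model, tokenizer, masked_instruction, response)
            - response_loss(model, tokenizer, instruction, response))

def select_for_replay(model, tokenizer, examples, threshold=0.1):
    """Keep examples with low key-part information gain, i.e. cases where the
    model appears to overfit to the general description of the instruction."""
    return [
        ex for ex in examples
        if key_part_information_gain(model, tokenizer, ex["instruction"],
                                     ex["key_part"], ex["response"]) < threshold
    ]
```

The same gain signal could equally be used to reweight the training loss rather than only to filter replay data; either reading is consistent with the summaries above.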
Keywords
» Artificial intelligence » Generalization » Instruction tuning » Overfitting