Semi-supervised Fine-tuning for Large Language Models
by Junyu Luo, Xiao Luo, Xiusi Chen, Zhiping Xiao, Wei Ju, Ming Zhang
First submitted to arXiv on: 17 Oct 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract here
Medium | GrooveSquid.com (original content) | Supervised fine-tuning (SFT) is crucial for adapting large language models (LLMs) to specific domains or tasks, but the limited availability of labeled data poses a significant challenge. To address this, the authors introduce SemiEvol, a semi-supervised fine-tuning framework that leverages both labeled and unlabeled data for LLM alignment. The framework adopts a bi-level approach, propagating knowledge from labeled to unlabeled data through both in-weight and in-context methods. It also incorporates a collaborative learning mechanism to select higher-quality pseudo-response samples from the unlabeled pool. Evaluated on seven general and domain-specific datasets with GPT-4o-mini and Llama-3.1, SemiEvol delivers significant improvements in model performance on target data. Compared with plain SFT and self-evolution methods, the framework proves practical in hybrid labeled-plus-unlabeled data scenarios.
Low | GrooveSquid.com (original content) | Imagine you have a super smart computer program that can understand and generate human-like language, but it needs extra help to learn about specific topics or tasks. This is like trying to teach a child who only knows a little English how to talk about their favorite sport. The authors created a way for this program to learn more by using both labeled data (questions with correct answers) and unlabeled data (questions without answers). Their new method, called SemiEvol, makes the program smarter and better at understanding language related to those specific topics or tasks.
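To make the pseudo-response selection idea concrete, here is a minimal sketch of confidence-by-agreement filtering: several collaborating model variants each propose an answer for an unlabeled question, and an answer is kept as a pseudo-label only when enough variants agree. This is an illustrative toy, not the SemiEvol implementation; the function name `select_pseudo_labels` and the voting rule are assumptions for demonstration.

```python
from collections import Counter

def select_pseudo_labels(candidates, min_agreement=0.5):
    """Keep a pseudo-label only if at least `min_agreement` of the
    collaborating model variants proposed the same answer.

    candidates: list of (question, [answer_from_each_variant]) pairs.
    Returns a list of (question, chosen_answer) pairs.
    """
    selected = []
    for question, answers in candidates:
        counts = Counter(answers)
        best_answer, votes = counts.most_common(1)[0]
        if votes / len(answers) >= min_agreement:
            selected.append((question, best_answer))
    return selected

# Toy example: three hypothetical "model variants" answered two
# unlabeled questions; only the high-agreement answer survives.
candidates = [
    ("Q1: capital of France?", ["Paris", "Paris", "Lyon"]),    # 2/3 agree
    ("Q2: largest planet?",    ["Saturn", "Jupiter", "Mars"]), # 1/3 agree
]
pseudo_labeled = select_pseudo_labels(candidates, min_agreement=0.6)
print(pseudo_labeled)  # [('Q1: capital of France?', 'Paris')]
```

In a semi-supervised loop, the surviving (question, answer) pairs would be merged with the labeled set for the next fine-tuning round; low-agreement examples are simply dropped rather than risking noisy supervision.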
Keywords
» Artificial intelligence » Alignment » Fine tuning » Gpt » Llama » Semi supervised » Supervised