Summary of SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning, by Yexiao He, Ziyao Wang, Zheyu Shen, Guoheng Sun, Yucong Dai, Yongkai Wu, Hongyi Wang, Ang Li
SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning
by Yexiao He, Ziyao Wang, Zheyu Shen, Guoheng Sun, Yucong Dai, Yongkai Wu, Hongyi Wang, Ang Li
First submitted to arXiv on: 23 Apr 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The paper introduces SHED, an automated dataset refinement framework for instruction fine-tuning that is based on the Shapley value. Fine-tuning adapts pre-trained Large Language Models (LLMs) to downstream tasks and human preferences, and SHED identifies the high-quality data in a large dataset for that purpose without human intervention or commercial LLMs. The refined datasets are transferable: they can be reused across different LLMs with consistently high performance. Extensive experiments show that SHED outperforms state-of-the-art methods across various tasks. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary The paper is about a new way to make computer programs better by using less data. Right now, these programs can be trained on huge amounts of information, but most of it might not even matter. The researchers developed a system called SHED that helps find the important parts and removes the rest. This makes the training process faster and more effective. They tested their method with many different types of tasks and found that it outperformed other approaches. |
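
The summaries above name the Shapley value but do not describe how SHED computes or uses it. As a rough illustration of the general idea only, and not the paper's actual procedure, the sketch below computes exact Shapley values for four data items under a made-up utility function. The names `shapley_values` and `toy_utility`, the item labels, and the scores are all hypothetical; in a real setting the utility would be something like model performance after fine-tuning on a subset, and exact enumeration would be replaced by an approximation because the number of orderings grows factorially.

```python
from itertools import permutations
from math import factorial


def shapley_values(items, utility):
    """Exact Shapley values: average each item's marginal
    contribution to the utility over every possible ordering."""
    totals = {item: 0.0 for item in items}
    for order in permutations(items):
        coalition = set()
        previous = utility(frozenset(coalition))
        for item in order:
            coalition.add(item)
            current = utility(frozenset(coalition))
            totals[item] += current - previous
            previous = current
    n_orders = factorial(len(items))
    return {item: total / n_orders for item, total in totals.items()}


def toy_utility(subset):
    """Hypothetical stand-in for 'model quality after training on subset'."""
    score = 0.0
    if "a" in subset:
        score += 0.5          # "a" is uniquely useful
    if "b" in subset or "c" in subset:
        score += 0.3          # "b" and "c" are redundant with each other
    if "d" in subset:
        score -= 0.1          # "d" is harmful
    return score


values = shapley_values(["a", "b", "c", "d"], toy_utility)
ranked = sorted(values, key=values.get, reverse=True)
print(values)
print("highest-value item:", ranked[0], "| lowest-value item:", ranked[-1])
```

In this toy setup the redundant items split their shared contribution, and the harmful item receives a negative value, which is the kind of signal a data-selection method could use to keep a small, high-value subset.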
Keywords
» Artificial intelligence » Fine tuning » Transferability