Summary of SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning, by Yexiao He, Ziyao Wang, Zheyu Shen, Guoheng Sun, Yucong Dai, Yongkai Wu, Hongyi Wang, and Ang Li

by Yexiao He, Ziyao Wang, Zheyu Shen, Guoheng Sun, Yucong Dai, Yongkai Wu, Hongyi Wang, Ang Li

First submitted to arXiv on: 23 Apr 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	High Difficulty Summary Read the original abstract here
Medium	GrooveSquid.com (original content)	Medium Difficulty Summary The paper introduces SHED, an automated dataset refinement framework for instruction fine-tuning that is based on the Shapley value. Instruction fine-tuning adapts pre-trained Large Language Models (LLMs) to various downstream tasks and human preferences. SHED identifies high-quality data within vast datasets without requiring human intervention or commercial LLMs. The refined datasets exhibit transferability: they can be reused across different LLMs with consistently high performance. Extensive experiments demonstrate SHED’s superiority over state-of-the-art methods on various tasks.
Low	GrooveSquid.com (original content)	Low Difficulty Summary The paper is about a new way to make computer programs better by using less data. Right now, these programs can be trained on huge amounts of information, but most of it might not even matter. The researchers developed a system called SHED that helps find the important parts and removes the rest. This makes the training process faster and more effective. They tested their method with many different types of tasks and found that it outperformed other approaches.
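
For readers unfamiliar with the Shapley value that SHED is built on, the following is a minimal sketch of the general idea only, not SHED’s actual pipeline. It computes exact Shapley values for four hypothetical training samples under a made-up utility function (`toy_utility`, the sample names, and the scores are all assumptions for illustration), then keeps the highest-valued samples. In practice the utility would be some measure of model performance, and exact enumeration is only feasible for tiny sets.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley values: average each player's marginal
    contribution over every possible ordering of the players."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        prev = utility(coalition)
        for p in order:
            coalition = coalition | {p}
            cur = utility(coalition)
            totals[p] += cur - prev  # marginal contribution of p
            prev = cur
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


def toy_utility(subset):
    """Hypothetical stand-in for 'model quality after training on subset'.
    Samples "a" and "b" are redundant, "c" is very useful, "d" is harmful."""
    score = 0.0
    if "a" in subset or "b" in subset:
        score += 1.0
    if "c" in subset:
        score += 2.0
    if "d" in subset:
        score -= 0.5
    return score


samples = ["a", "b", "c", "d"]
values = shapley_values(samples, toy_utility)

# Keep the two samples with the highest Shapley value.
top_k = sorted(values, key=values.get, reverse=True)[:2]
print(values)
print(top_k)
```

In this toy setup the redundant samples split their shared credit, the harmful sample gets a negative value, and the values sum to the utility of the full set, which is why ranking by Shapley value is a plausible basis for pruning a dataset.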

Keywords

» Artificial intelligence » Fine tuning » Transferability
