Summary of SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models, by Yu Yang et al.
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models
by Yu Yang, Siddhartha Mishra, Jeffrey N Chiang, Baharan Mirzasoleiman
First submitted to arXiv on: 12 Mar 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper introduces an effective and scalable data selection method for supervised fine-tuning (SFT). The SmallToLarge (S2L) algorithm leverages training trajectories from small models to guide data selection for larger models, improving data efficiency in SFT for specialized domains such as mathematical problem-solving and clinical text summarization. In extensive experiments, S2L significantly outperforms state-of-the-art data selection algorithms, reducing the training data required to match full-dataset performance while improving accuracy on the challenging MATH benchmark with models such as Phi-2. Because S2L performs data selection with a reference model 40x smaller than the target model, the cost of data selection is reduced proportionally. (A hedged code sketch of this trajectory-based selection appears after the table.) |
Low | GrooveSquid.com (original content) | This paper introduces a new way to pick the right data for training language models. The method, called SmallToLarge (S2L), helps choose training data for large models by watching how a much smaller model learns from that data. This makes it possible to reach roughly the same performance as training on all the available data while using much less of it. The results show that S2L outperforms other methods on various tasks, such as math problems and summarizing medical texts. |
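The summaries above describe S2L only at a high level: training trajectories recorded on a small model guide which examples a larger model is fine-tuned on. The sketch below is a minimal, hypothetical illustration of that kind of trajectory-based selection, assuming per-example loss trajectories from the small model's checkpoints are clustered and a balanced subset is drawn from the clusters. The function name, parameters, and clustering choice (k-means) are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_by_trajectories(loss_trajectories: np.ndarray,
                           budget: int,
                           n_clusters: int = 100,
                           seed: int = 0) -> np.ndarray:
    """Pick `budget` example indices by clustering per-example loss
    trajectories recorded while training a small proxy model.

    loss_trajectories: array of shape (n_examples, n_checkpoints),
        where entry [i, t] is example i's loss at checkpoint t.
    """
    rng = np.random.default_rng(seed)
    n_examples = loss_trajectories.shape[0]

    # Group examples whose losses evolve similarly during small-model training.
    labels = KMeans(n_clusters=n_clusters, random_state=seed,
                    n_init="auto").fit_predict(loss_trajectories)

    # Sample clusters as evenly as possible so rare training behaviors
    # are still represented in the selected subset.
    clusters = [np.flatnonzero(labels == c) for c in range(n_clusters)]
    clusters = [c for c in clusters if len(c)]
    per_cluster = max(1, budget // len(clusters))
    selected = []
    for members in clusters:
        take = min(per_cluster, len(members))
        selected.extend(rng.choice(members, size=take, replace=False))

    # Top up (or trim) to hit the exact budget.
    selected = np.array(selected[:budget])
    if len(selected) < budget:
        remaining = np.setdiff1d(np.arange(n_examples), selected)
        extra = rng.choice(remaining, size=budget - len(selected), replace=False)
        selected = np.concatenate([selected, extra])
    return selected
```

In the setting the summaries describe, the trajectories would come from a model roughly 40x smaller than the fine-tuning target, which is why this selection step is cheap relative to fine-tuning on the full dataset.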
Keywords
* Artificial intelligence
* Fine-tuning
* Summarization
* Supervised