Compute-Constrained Data Selection

by Junjie Oscar Yin, Alexander M. Rush

First submitted to arXiv on: 21 Oct 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary
Written by the paper authors; this version is the paper’s original abstract, available on arXiv.

Medium Difficulty Summary
Written by GrooveSquid.com (original content).
The paper addresses the challenge of finetuning large language models (LLMs) under budget constraints by introducing a cost-aware utility function for data selection. It formulates the problem as a trade-off between the initial cost of selecting data and the resulting training gain. The authors run experiments across a range of tasks, scaling finetuning tokens, model sizes, and data selection compute to compare methods. Surprisingly, many powerful data selection methods are not compute-optimal: cheaper alternatives dominate both theoretically and empirically. For compute-optimal training, the paper finds that perplexity-based and gradient-based data selection each require a specific ratio of training-model to selection-model size.
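To make the trade-off concrete, here is a minimal Python sketch of cost-aware budgeting. It is our illustration, not code from the paper: every name and number in it (SelectionMethod, quality_gain, the FLOP costs) is a hypothetical assumption. Each method pays its own selection compute out of a fixed FLOP budget, and whatever remains buys training tokens.

```python
# Hypothetical sketch of cost-aware data selection budgeting.
# Illustrative only; the paper's actual utility function and cost model may differ.

from dataclasses import dataclass

@dataclass
class SelectionMethod:
    name: str
    select_flops_per_token: float  # compute spent scoring one candidate token
    quality_gain: float            # assumed utility multiplier per trained token

def trainable_tokens(method: SelectionMethod,
                     budget_flops: float,
                     candidate_tokens: float,
                     train_flops_per_token: float) -> float:
    """Tokens we can still afford to train on after paying the selection cost."""
    selection_cost = method.select_flops_per_token * candidate_tokens
    remaining = max(budget_flops - selection_cost, 0.0)
    return remaining / train_flops_per_token

# Toy numbers (all assumptions): a fixed budget, a pool of candidate tokens,
# and a per-token training cost of roughly 6 * params for a ~1B model.
budget = 1e18
pool = 1e10
train_cost = 6e9

methods = [
    SelectionMethod("random", 0.0, 1.00),      # free selection, baseline quality
    SelectionMethod("perplexity", 2e7, 1.30),  # one forward pass of a small scorer
    SelectionMethod("gradient", 6e7, 1.50),    # forward + backward per candidate
]

for m in methods:
    t = trainable_tokens(m, budget, pool, train_cost)
    # Crude utility: quality-weighted count of trainable tokens.
    print(f"{m.name:>10}: {t:.3g} trainable tokens, utility ~ {m.quality_gain * t:.3g}")
```

With these made-up numbers, the gradient scorer's selection cost eats so much of the budget that its higher per-token gain cannot compensate, while the cheap perplexity scorer edges out random selection. This mirrors, only qualitatively, the paper's finding that expensive selection methods are often not compute-optimal.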
Low Difficulty Summary
Written by GrooveSquid.com (original content).
This research explores ways to improve large language models without using too much computer power or money. The team created a special formula to choose the best data for fine-tuning these models. They tested different methods, scaling up or down to see what works best. What they found was interesting: many advanced techniques aren’t the most efficient when it comes to using computer resources. Instead, simpler approaches work just as well and are more cost-effective. This is important because it helps us make better use of our computing power and budget.

Keywords

» Artificial intelligence  » Fine tuning  » Perplexity