How much is a noisy image worth? Data Scaling Laws for Ambient Diffusion
by Giannis Daras, Yeshwanth Cherapanamjeri, Constantinos Daskalakis
First submitted to arXiv on: 5 Nov 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here. |
| Medium | GrooveSquid.com (original content) | This paper investigates how the quality of generative models depends on the quality of their training data. The authors compare diffusion models trained on corrupted datasets with those trained on clean data, showing that even with large-scale datasets, models trained only on noisy data cannot match the performance of models trained on clean data. However, combining a small amount of clean data with a much larger set of noisy data achieves near state-of-the-art performance. The paper supports these findings theoretically by developing novel sample complexity bounds for learning Gaussian mixtures with heterogeneous variances. |
| Low | GrooveSquid.com (original content) | This research looks at how good generative models are and how much they depend on the quality of their training data. The scientists found that even with lots of data, if it was all noisy, they couldn't make the model as good as one trained on clean data. But they did find that mixing a little bit of clean data with lots of noisy data could make the model almost as good as one trained solely on clean data. |
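To make the data-mixing setting concrete, here is a minimal sketch of how one might construct such a training set: a small clean subset plus a much larger subset corrupted with additive Gaussian noise. This is an illustration only; the function name `make_mixed_dataset` and the noise level `sigma` are our own assumptions, not details from the paper.

```python
import numpy as np

def make_mixed_dataset(clean_images, n_clean, n_noisy, sigma=0.2, seed=0):
    """Split a pool of clean images into a small clean subset and a
    large additively-noised subset (toy stand-in for the paper's
    setting of mixing a few clean samples with many corrupted ones)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(clean_images))
    clean_part = clean_images[idx[:n_clean]]
    noisy_src = clean_images[idx[n_clean:n_clean + n_noisy]]
    # Corrupt with i.i.d. Gaussian noise of standard deviation sigma.
    noisy_part = noisy_src + sigma * rng.standard_normal(noisy_src.shape)
    return clean_part, noisy_part

# Toy usage: 1000 synthetic 8x8 "images", 5% clean and 90% noisy.
pool = np.random.default_rng(1).standard_normal((1000, 8, 8))
clean, noisy = make_mixed_dataset(pool, n_clean=50, n_noisy=900, sigma=0.2)
```

A real pipeline would feed both subsets to an ambient-diffusion-style training loop that accounts for the known corruption level of the noisy subset.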
Keywords
» Artificial intelligence » Diffusion