Summary of Exploring the Impact of Dataset Bias on Dataset Distillation, by Yao Lu et al.
Exploring the Impact of Dataset Bias on Dataset Distillation
by Yao Lu, Jianyang Gu, Xuguang Chen, Saeed Vahidian, Qi Xuan
First submitted to arXiv on: 24 Mar 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper proposes a new direction in dataset distillation (DD), which synthesizes smaller datasets that preserve the essential information of the original large-scale ones. The authors investigate how biases in the original datasets affect DD, since current methods assume unbiased data. They construct two biased datasets and apply existing DD methods to generate synthetic datasets from them. The results show that bias significantly degrades the performance of the synthetic datasets, highlighting the need to identify and mitigate biases during DD. The paper also reformulates DD for the biased-dataset setting. |
Low | GrooveSquid.com (original content) | This research focuses on making large-scale datasets more manageable by creating smaller versions that capture the important information. The scientists studied how problems in the original datasets can affect this process. They created two datasets with intentional biases and used existing methods to build new, smaller datasets from these flawed ones. The results show that issues in the original data greatly reduce the quality of the synthetic datasets, which highlights the importance of identifying and fixing flaws in the original data before creating smaller versions. |
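The core idea of dataset distillation, compressing a large dataset into a few synthetic points that preserve its essential statistics, can be sketched in a few lines. The toy example below is an illustrative assumption, not any method from the paper: it distills each class of a made-up 2-D Gaussian dataset down to three synthetic points by gradient descent on a mean-matching objective. The function name `distill` and all data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def distill(X, n_syn=3, lr=0.1, steps=200):
    """Toy dataset distillation sketch (hypothetical, not the paper's method):
    optimize n_syn synthetic points so their mean matches the mean of X."""
    target = X.mean(axis=0)
    S = rng.normal(size=(n_syn, X.shape[1]))  # random init of synthetic points
    for _ in range(steps):
        # Gradient of ||mean(S) - target||^2 with respect to each row of S
        grad = 2.0 * (S.mean(axis=0) - target) / n_syn
        S -= lr * grad
    return S

# Two toy "classes"; the second is shifted, loosely mimicking dataset bias:
# any bias in the class statistics is inherited by the distilled points.
X0 = rng.normal(loc=0.0, size=(500, 2))
X1 = rng.normal(loc=3.0, size=(500, 2))

S0, S1 = distill(X0), distill(X1)
```

The point of the sketch is that the synthetic points faithfully reproduce whatever statistics the original data has, biased or not, which is exactly why bias in the source dataset propagates into the distilled one.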
Keywords
* Artificial intelligence
* Distillation