Summary of Distributional Dataset Distillation with Subtask Decomposition, by Tian Qin et al.
Distributional Dataset Distillation with Subtask Decomposition
by Tian Qin, Zhiwei Deng, David Alvarez-Melis
First submitted to arXiv on: 1 Mar 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper proposes a new approach to dataset distillation, which compresses a large dataset into a small set of input-label pairs called prototypes. The key observation is that existing methods relying on explicit prototypes can be suboptimal and incur unexpected storage costs from distilled labels. To address this, the authors propose Distributional Dataset Distillation (D3), which encodes the data as minimal sufficient per-class statistics paired with a decoder, distilling the dataset into a compact distributional representation that is more memory-efficient than prototype-based methods. To scale up learning these representations, the authors propose Federated Distillation, which decomposes the dataset into subsets, distills them in parallel using subtask experts, and then re-aggregates the results. The method is evaluated on a three-dimensional metric and achieves state-of-the-art results on TinyImageNet and ImageNet-1K, outperforming prior art by 6.9% under a storage budget of 2 images per class. (A minimal code sketch of each idea follows this table.) |
Low | GrooveSquid.com (original content) | The paper introduces a new way to shrink big datasets into small ones. This helps reduce the amount of memory needed to store data. The method is called Distributional Dataset Distillation (D3). It uses simple per-class statistics and a special tool called a decoder to convert data into a smaller form. This makes it more efficient than other methods that try to do the same thing. To make this process faster, the authors suggest breaking the dataset into smaller parts, processing them separately, and then combining the results. The method is tested on two big datasets and performs better than previous attempts. |
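The sketch below is not the authors' code; it is a minimal illustration of the distributional idea described in the medium summary: instead of storing prototype images, each class is stored as the parameters of a latent distribution (here assumed to be a diagonal Gaussian), and a single shared decoder turns latent samples into synthetic training images. All sizes, the Gaussian choice, and the tiny MLP decoder are assumptions made for illustration.

```python
# Hedged sketch of a distributional distilled dataset: per-class latent
# statistics plus one shared decoder; synthetic examples are sampled on demand.
import torch
import torch.nn as nn

NUM_CLASSES, LATENT_DIM, IMG_SIZE = 10, 64, 32  # hypothetical sizes


class ClassDistribution(nn.Module):
    """Per-class statistics: a diagonal Gaussian in latent space."""

    def __init__(self, num_classes: int, latent_dim: int):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(num_classes, latent_dim))
        self.log_sigma = nn.Parameter(torch.zeros(num_classes, latent_dim))

    def sample(self, labels: torch.Tensor) -> torch.Tensor:
        # Reparameterized sample z = mu + sigma * eps for each requested label.
        eps = torch.randn(labels.shape[0], self.mu.shape[1])
        return self.mu[labels] + self.log_sigma[labels].exp() * eps


class Decoder(nn.Module):
    """Shared decoder mapping latent samples to synthetic images."""

    def __init__(self, latent_dim: int, img_size: int):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 * img_size * img_size), nn.Tanh(),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z).view(-1, 3, self.img_size, self.img_size)


dist = ClassDistribution(NUM_CLASSES, LATENT_DIM)
decoder = Decoder(LATENT_DIM, IMG_SIZE)

# To train a downstream model, draw a fresh synthetic batch from the distilled
# distribution instead of reading stored prototype images.
labels = torch.randint(0, NUM_CLASSES, (16,))
synthetic_images = decoder(dist.sample(labels))
print(synthetic_images.shape)  # torch.Size([16, 3, 32, 32])
```

Storing only the distribution parameters and a small decoder is what makes the representation cheaper than keeping explicit prototype images and labels.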
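The second sketch illustrates the subtask-decomposition idea at a high level: split the label space into shards, distill each shard with an independent expert (which could run in parallel), then re-aggregate the per-class statistics. The `distill_shard` function here is a hypothetical placeholder standing in for a real distillation loop, and the shard count is an assumption.

```python
# Hedged sketch of subtask decomposition: distill shards of classes
# independently, then merge their per-class statistics.
import torch

NUM_CLASSES, LATENT_DIM, NUM_SHARDS = 10, 64, 2  # hypothetical sizes


def distill_shard(class_ids: list[int]) -> dict[int, torch.Tensor]:
    # Placeholder expert: a real expert would optimize its shard's latent
    # statistics against the real data for those classes.
    return {c: torch.randn(LATENT_DIM) for c in class_ids}


# Decompose the label space into disjoint subtasks.
shards = [list(range(i, NUM_CLASSES, NUM_SHARDS)) for i in range(NUM_SHARDS)]

# Distill each subtask independently (these calls could run on separate workers).
expert_outputs = [distill_shard(shard) for shard in shards]

# Re-aggregate: merge the per-class statistics produced by all experts.
distilled_means = {c: mu for output in expert_outputs for c, mu in output.items()}
print(sorted(distilled_means))  # every class id 0..9 is recovered
```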
Keywords
* Artificial intelligence
* Decoder
* Distillation