Summary of DataDream: Few-shot Guided Dataset Generation, by Jae Myung Kim et al.
DataDream: Few-shot Guided Dataset Generation
by Jae Myung Kim, Jessica Bader, Stephan Alaniz, Cordelia Schmid, Zeynep Akata
First submitted to arXiv on: 15 Jul 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper's original abstract (see the arXiv listing) |
Medium | GrooveSquid.com (original content) | DataDream is a framework for synthesizing classification datasets. It addresses the limitations of previous text-to-image diffusion models in generating fine-grained features and in-distribution images. By fine-tuning LoRA weights on few-shot examples, DataDream adapts the image generation model so that the training data it generates represents the real data distribution more faithfully. This improves downstream image classification, surpassing state-of-the-art few-shot accuracy on 7 out of 10 datasets while remaining competitive on the other 3. Extensive experiments demonstrate the method's efficacy and examine how the number of real shots, the number of generated images, and the fine-tuning compute affect model performance. |
Low | GrooveSquid.com (original content) | DataDream helps create better training data for image classifiers. It adjusts an image generation model using a few real examples of the images you want it to generate. The result is more realistic synthetic training data that helps the classifier learn better. The approach beats previous methods on 7 out of 10 datasets. |
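
The summaries mention fine-tuning LoRA weights without saying what those are. As a rough illustration of the general LoRA idea (not DataDream's actual code), the sketch below keeps a weight matrix `W` frozen and learns only two small matrices `A` and `B`, so the effective weight is `W + (alpha / r) * B @ A`. All names, shapes, and values here are hypothetical choices for illustration, and the training loop is omitted.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the LoRA rank."""
    r = len(a)  # A has shape (r, d_in); B has shape (d_out, r)
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update with shape (d_out, d_in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_count(d_out, d_in, r):
    """Number of trainable values in A and B combined."""
    return r * (d_in + d_out)


random.seed(0)
d_out, d_in, rank, alpha = 4, 6, 2, 4.0  # illustrative sizes, not from the paper

# Frozen pretrained weight (random stand-in values).
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# A starts with small random values; B starts at zero, so the update starts at zero.
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
b = [[0.0] * rank for _ in range(d_out)]

w_start = lora_effective_weight(w, a, b, alpha)
full_params = d_out * d_in
lora_params = lora_param_count(d_out, d_in, rank)
print(w_start == w)              # the model is unchanged before any training step
print(full_params, lora_params)  # trainable values: full matrix vs. LoRA factors
```

Because `B` starts at zero, the adapted model initially behaves exactly like the pretrained one, and fine-tuning on a handful of images only has to move the small factors. The saving grows with layer size: for a hypothetical 768 by 768 layer at rank 16, `lora_param_count(768, 768, 16)` gives 24,576 trainable values, compared with 589,824 in the full matrix.
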
Keywords
» Artificial intelligence » Classification » Few-shot » Fine-tuning » Image classification » Image generation » LoRA