Summary of Diffusion Soup: Model Merging for Text-to-Image Diffusion Models, by Benjamin Biggs et al.
Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
by Benjamin Biggs, Arjun Seshadri, Yang Zou, Achin Jain, Aditya Golatkar, Yusheng Xie, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto
First submitted to arXiv on: 12 Jun 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | We present Diffusion Soup, a compartmentalization method for text-to-image generation that averages the weights of diffusion models trained on sharded data. This approach enables training-free continual learning and unlearning without additional memory or inference cost. Our method samples from a point in weight space that approximates the geometric mean of the distributions of the constituent datasets, offering anti-memorization guarantees and zero-shot style-mixing capabilities. Empirically, Diffusion Soup outperforms a paragon model trained on the union of all data shards, improving Image Reward (IR) by 30% on domain-sharded data and 59% on aesthetic data, and TIFA score by up to 1.3%. We demonstrate robust unlearning and validate our theoretical insights on real data. Finally, we showcase Diffusion Soup's ability to blend the distinct styles of models finetuned on different shards, yielding zero-shot generation of hybrid styles. |
| Low | GrooveSquid.com (original content) | Imagine being able to generate new images that blend the styles learned by different models. This is what our team has achieved with a new method called Diffusion Soup. We took existing image-generation models and combined their strengths in a special way: by averaging their weights. This lets us generate unique images that mix the styles of the models we combined. Our approach also lets us remove or add information without retraining the entire model. We tested our method on various datasets and found that it outperforms other methods, generating high-quality images in the desired styles. We're excited about the possibilities this could bring to areas like art and design. |
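The core mechanism the summaries describe — uniformly averaging the weights of several models trained on different data shards — can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it represents each model's parameters as a plain dict of name → list of floats, whereas a real system would average the state dicts of finetuned diffusion models (e.g. PyTorch `state_dict()` tensors) key by key.

```python
def soup(state_dicts):
    """Uniformly average the parameters of models with identical architecture.

    state_dicts: list of dicts mapping parameter name -> list of floats.
    Returns a new dict with each parameter averaged element-wise.
    """
    if not state_dicts:
        raise ValueError("need at least one model to merge")
    merged = {}
    for name in state_dicts[0]:
        # Collect this parameter from every model and average element-wise.
        columns = zip(*(sd[name] for sd in state_dicts))
        merged[name] = [sum(col) / len(state_dicts) for col in columns]
    return merged

# Hypothetical example: two models finetuned on different data shards.
m1 = {"w": [1.0, 2.0], "b": [0.0]}
m2 = {"w": [3.0, 4.0], "b": [2.0]}
print(soup([m1, m2]))  # -> {'w': [2.0, 3.0], 'b': [1.0]}
```

Because merging is just averaging, "unlearning" a shard amounts to re-averaging without that shard's model — no retraining is needed, which is the training-free property the summaries highlight.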
Keywords
» Artificial intelligence » Continual learning » Diffusion » Image generation » Inference » Zero shot