EdgeFusion: On-Device Text-to-Image Generation
by Thibault Castells, Hyoung-Kyu Song, Tairen Piao, Shinkook Choi, Bo-Kyeong Kim, Hanyoung Yim, Changgwun Lee, Jae Gon Kim, Tae-Ho Kim
First submitted to arXiv on: 18 Apr 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here. |
| Medium | GrooveSquid.com (original content) | This paper tackles the heavy computational cost of Stable Diffusion (SD) for text-to-image generation by starting from a compact SD variant, BK-SDM. Prior work accelerates SD by reducing sampling steps (e.g., with Latent Consistency Models) and by architectural optimizations such as pruning and knowledge distillation. Building on these, the study introduces two strategies: leveraging high-quality image-text pairs from leading generative models, and designing an advanced distillation process suited to few-step sampling. Through quantization, profiling, and on-device deployment, the team achieves photo-realistic images in just two steps, with latency under one second on resource-limited devices (see the two-step inference sketch after this table). |
| Low | GrooveSquid.com (original content) | This paper tackles a problem that makes it hard to use Stable Diffusion to create pictures from text: it needs a lot of computing power. The researchers build on a special, smaller version of SD that uses much less of it. People already speed things up by having a small model learn from a bigger one. This study adds two new ideas: training on really good image-text pairs, and a better way for the small model to learn. With some extra engineering on top, the model can make pictures really fast, in under one second! This matters because it means Stable Diffusion can run on devices that don't have super powerful computers. |
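
To make the medium summary's two-step claim concrete: the paper applies Latent Consistency Model (LCM)-style distillation to a compact SD model, and LCM-distilled models can sample in very few steps. Below is a minimal sketch of two-step inference using the Hugging Face diffusers library. It is an illustration under stated assumptions, not the paper's actual pipeline: the checkpoint path is hypothetical, standing in for a compact, LCM-distilled SD model like the one the paper describes.

```python
# Minimal sketch: two-step text-to-image inference with an LCM-style scheduler.
# Assumes the Hugging Face diffusers library; the checkpoint path below is
# hypothetical (EdgeFusion's own weights and pipeline are not shown here).
import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

model_id = "path/to/compact-lcm-distilled-sd"  # hypothetical compact SD checkpoint
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)

# LCM-distilled models trade many denoising steps for a consistency objective,
# which is what makes very low step counts (here, two) viable.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

image = pipe(
    prompt="a photo of an astronaut riding a horse",
    num_inference_steps=2,  # the paper reports photo-realistic results in two steps
    guidance_scale=1.0,     # LCM-style sampling typically uses low or no CFG
).images[0]
image.save("out.png")
```

Note that this only illustrates the few-step sampling side; the sub-second on-device latency the paper reports additionally relies on its quantization and deployment work.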
Keywords
» Artificial intelligence » Diffusion » Distillation » Image generation » Knowledge distillation » Pruning » Quantization