Summary of Infusion: Preventing Customized Text-to-image Diffusion From Overfitting, by Weili Zeng et al.

Infusion: Preventing Customized Text-to-Image Diffusion from Overfitting

by Weili Zeng, Yichao Yan, Qi Zhu, Zhuo Chen, Pengzhi Chu, Weiming Zhao, Xiaokang Yang

First submitted to arXiv on: 22 Apr 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	Read the original abstract here
Medium	GrooveSquid.com (original content)	Text-to-image (T2I) customization aims to create images matching specific textual descriptions, but existing methods face a major challenge: concept overfitting. Our paper analyzes this issue, categorizing it into two types: concept-agnostic overfitting, which undermines non-customized concept knowledge, and concept-specific overfitting, which is confined to limited modalities like backgrounds or styles. To evaluate the degree of overfitting, we introduce two metrics: the Latent Fisher divergence and the Wasserstein metric. Our proposed Infusion method learns target concepts without being constrained by limited training modalities while preserving non-customized knowledge. This approach requires only 11KB of trained parameters and outperforms state-of-the-art methods in single and multi-concept customized generation.
Low	GrooveSquid.com (original content)	Imagine you want a picture that looks like what someone is describing to you. Current computer programs struggle to make these pictures because they get too good at one specific type of picture and forget how to make other types. Our solution, called Infusion, helps the program learn about different types of pictures without getting stuck on just one. This makes our approach better than others at making customized pictures that match what someone is describing.

Keywords

* Artificial intelligence
* Overfitting
