Summary of Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation, by Tong Xie et al.


Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation

by Tong Xie, Haoyu Li, Andrew Bai, Cho-Jui Hsieh

First submitted to arXiv on: 17 Jan 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract on arXiv.

Medium Difficulty Summary (original content by GrooveSquid.com)
In this paper, researchers develop data attribution methods for better understanding black-box neural networks, focusing on diffusion models. Unlike previous work, which established links between model output and training data in instantaneous input-output settings, this study traces diffusion model outputs back to training samples across a sequence of timesteps. The authors introduce Diffusion-TracIn, which accounts for these temporal dynamics, and observe that loss gradient norms depend heavily on the timestep. This dependence biases influence estimation: training samples seen at large-norm-inducing timesteps tend to appear generally influential. To mitigate this effect, the researchers propose Diffusion-ReTrac, a re-normalized adaptation that yields influence measurements more targeted to the test sample of interest and considerably more intuitive visualizations (see the sketch after these summaries). The approach is demonstrated through various evaluation metrics and auxiliary tasks, reducing the number of generally influential samples to one-third of its original quantity.

Low Difficulty Summary (original content by GrooveSquid.com)
This paper creates new methods for understanding how neural networks work, especially complex models called diffusion models. These models are hard to interpret because they build their output over many steps. The researchers trace a model's output back to its training data across those steps, which shows which training examples matter most for a given result. They also found that some steps can make certain examples look more important than they really are, so they developed a corrected method that highlights the examples truly responsible for the model's behavior, making it more intuitive and easier to understand.

Keywords

  • Artificial intelligence
  • Diffusion
  • Diffusion model