Summary of Taming Mode Collapse in Score Distillation for Text-to-3D Generation, by Peihao Wang et al.

Taming Mode Collapse in Score Distillation for Text-to-3D Generation

by Peihao Wang, Dejia Xu, Zhiwen Fan, Dilin Wang, Sreyas Mohan, Forrest Iandola, Rakesh Ranjan, Yilei Li, Qiang Liu, Zhangyang Wang, Vikas Chandra

First submitted to arXiv on: 31 Dec 2023

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	High Difficulty Summary Read the original abstract here
Medium	GrooveSquid.com (original content)	Medium Difficulty Summary This paper studies score distillation for text-to-3D generation, a technique that suffers from a view inconsistency problem known as the “Janus” artifact. The authors show that existing score distillation methods degenerate into maximal likelihood seeking on each view independently, which leads to mode collapse and, in turn, to Janus artifacts. To address this, they introduce Entropic Score Distillation (ESD), a new objective function that maximizes entropy and thereby encourages diversity among different views. ESD can be implemented by applying the classifier-free guidance trick on top of variational score distillation. Extensive experiments show that ESD is effective at mitigating Janus artifacts.
Low	GrooveSquid.com (original content)	Low Difficulty Summary This paper helps solve a problem with making 3D pictures from text. Right now, this process is not perfect and often shows the same front face on different views. Researchers have tried to fix this issue by changing how they create scores for the generated images. However, they haven’t understood why it works or what’s going wrong. This paper figures out that the problem is caused by the way current methods try to make each view look perfect. To solve this, the authors develop a new method called Entropic Score Distillation (ESD). ESD helps create more diverse and realistic 3D images.
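
The two ideas in the medium summary, entropy as a measure of diversity across views and the classifier-free guidance trick, can be illustrated with a small sketch. This is not the authors' implementation: the four-mode distributions, the function names `entropy` and `cfg_mix`, the weight `lam`, and the choice of which two predictions get mixed are all assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def cfg_mix(eps_cond, eps_uncond, lam):
    """Generic classifier-free-guidance-style mix of two noise predictions.

    Assumption: for ESD the two inputs would be a camera-conditioned and a
    camera-unconditioned prediction; the summary does not spell this out.
    """
    return [lam * c + (1.0 - lam) * u for c, u in zip(eps_cond, eps_uncond)]


# Hypothetical distributions over four appearance modes
# (front, left, back, right) of the rendered views.
collapsed = [1.0, 0.0, 0.0, 0.0]      # every view shows the front face
diverse = [0.25, 0.25, 0.25, 0.25]    # views are spread across all modes

h_collapsed = entropy(collapsed)
h_diverse = entropy(diverse)

# Toy noise predictions mixed with an illustrative weight of 0.5.
mixed = cfg_mix([1.0, 2.0], [0.0, 0.0], lam=0.5)
```

A collapsed distribution, where every view lands on the same mode, has zero entropy, while the uniform one reaches the maximum of log 4. An entropy-maximizing objective pushes toward the latter, which is the intuition behind reduced Janus artifacts.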

Keywords

* Artificial intelligence
* Distillation
* Likelihood
* Objective function
