Summary of DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents, by Yilun Xu et al.
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents
by Yilun Xu, Gabriele Corso, Tommi Jaakkola, Arash Vahdat, Karsten Kreis
First submitted to arXiv on: 3 Jul 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | Diffusion models (DMs) have significantly impacted generative learning by encoding data into a simple Gaussian distribution through a diffusion process. However, mapping a single Gaussian to a complex, multimodal data distribution may be unnecessarily difficult. To address this, the authors propose Discrete-Continuous Latent Variable Diffusion Models (DisCo-Diff), which augment DMs with complementary discrete latent variables. These latents are inferred by an encoder and trained end-to-end with the DM. Because DisCo-Diff does not rely on pre-trained networks, the framework is universally applicable. Introducing a few discrete variables with small codebooks simplifies the DM's noise-to-data mapping. The authors validate DisCo-Diff on toy data, several image synthesis tasks, and molecular docking, finding that it consistently outperforms baseline models, including improved FID scores on image synthesis. |
Low | GrooveSquid.com (original content) | This research paper is about a new way to improve generative learning using something called diffusion models. The current method of encoding data into one simple distribution can struggle when the data contains many different kinds of patterns. To make things easier, the researchers add an extra layer of information to the data: discrete codes that help the model understand what's important and what's not. The new approach is simpler and more flexible than the old method and works on different types of data, including images and molecules. The results show that it performs better than previous models in the situations tested. |
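The medium-difficulty summary describes a two-part generative process: a discrete latent (drawn from a small codebook) conditions the continuous diffusion model's denoiser, which then maps noise to data. The toy sketch below illustrates only that structure, not the paper's actual method: the denoiser is a fixed linear map rather than a trained network, the update rule is a made-up relaxation step, and the discrete latent is drawn from a uniform prior (the real model learns a distribution over codes end-to-end with an encoder). All names and sizes here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

DATA_DIM, K, EMB = 2, 8, 4               # data dim, codebook size, embedding dim (illustrative)
codebook = rng.normal(size=(K, EMB))     # discrete-latent embeddings; learned jointly in the paper
W_x = 0.9 * np.eye(DATA_DIM)             # stand-in for denoiser weights (not learned here)
W_c = 0.1 * rng.normal(size=(EMB, DATA_DIM))

def denoiser(x_t, code):
    """Toy 'denoiser': predicts clean data from noisy x_t, conditioned on a discrete code index."""
    return x_t @ W_x + codebook[code] @ W_c

def sample(n_steps=10):
    # Step 1: pick the discrete latent. Here a uniform prior; DisCo-Diff instead
    # learns the code distribution together with the diffusion model.
    code = int(rng.integers(K))
    # Step 2: reverse diffusion conditioned on the code, starting from Gaussian noise.
    x = rng.normal(size=DATA_DIM)
    for _ in range(n_steps):
        x0_hat = denoiser(x, code)
        x = x + 0.5 * (x0_hat - x)       # toy update: move halfway toward the prediction
    return code, x

code, x = sample()
print(code, x)                            # a code index in [0, K) and a DATA_DIM-dim sample
```

The point of the conditioning is visible in `denoiser`: each discrete code shifts the denoiser's output differently, so each code can "own" one mode of a multimodal data distribution, leaving the continuous diffusion process a simpler, more unimodal mapping to learn.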
Keywords
» Artificial intelligence » Diffusion » Encoder » Image synthesis