
Summary of Normalizing Flows Are Capable Generative Models, by Shuangfei Zhai et al.


Normalizing Flows are Capable Generative Models

by Shuangfei Zhai, Ruixiang Zhang, Preetum Nakkiran, David Berthelot, Jiatao Gu, Huangjie Zheng, Tianrong Chen, Miguel Angel Bautista, Navdeep Jaitly, Josh Susskind

First submitted to arXiv on: 9 Dec 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

The high difficulty summary is the paper's original abstract, which can be read on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Normalizing Flows (NFs) are likelihood-based models for continuous inputs that have shown promise in density estimation and generative modeling tasks, but they have received relatively little attention recently. This paper demonstrates that NFs are more powerful than previously believed by introducing TarFlow: a simple and scalable architecture that enables highly performant NF models. TarFlow is a Transformer-based variant of Masked Autoregressive Flows (MAFs): a stack of autoregressive Transformer blocks operating on image patches, with the autoregression direction alternating between layers. It can be trained end-to-end and directly models and generates pixels. The authors also propose three techniques to improve sample quality: Gaussian noise augmentation during training, post-training denoising, and an effective guidance method for both class-conditional and unconditional settings. As a result, TarFlow sets new state-of-the-art results on likelihood estimation for images, outperforming previous methods by a large margin, and generates samples with quality and diversity comparable to diffusion models.
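The core idea described above, an autoregressive flow over a patch sequence whose direction flips between layers, can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: a simple prefix-mean function stands in for the causal Transformer, and the names (`ar_params`, `tarflow_forward`, etc.) are hypothetical. It does show the two properties that make such a flow work: an exact log-determinant accumulated per step, and exact sequential invertibility.

```python
import numpy as np

def ar_params(prefix):
    # Stand-in for a causal Transformer: predicts a shift (mu) and a
    # log-scale (alpha) for position t from the patches before it.
    if prefix.size == 0:
        return 0.0, 0.0
    m = prefix.mean()
    return 0.5 * m, 0.1 * m

def flow_layer_forward(x):
    # One autoregressive flow layer: z[t] = (x[t] - mu) * exp(-alpha),
    # where (mu, alpha) depend only on x[:t]. Returns (z, log_det).
    z = np.empty_like(x)
    log_det = 0.0
    for t in range(len(x)):
        mu, alpha = ar_params(x[:t])
        z[t] = (x[t] - mu) * np.exp(-alpha)
        log_det += -alpha
    return z, log_det

def flow_layer_inverse(z):
    # Inversion is sequential: each x[t] uses the already-decoded x[:t].
    x = np.empty_like(z)
    for t in range(len(z)):
        mu, alpha = ar_params(x[:t])
        x[t] = z[t] * np.exp(alpha) + mu
    return x

def tarflow_forward(x, n_layers=4):
    # Stack layers, reversing the patch order between layers so each
    # layer conditions on a different "past" (direction alternation).
    log_det = 0.0
    for _ in range(n_layers):
        x, ld = flow_layer_forward(x)
        log_det += ld
        x = x[::-1].copy()
    return x, log_det

def tarflow_inverse(z, n_layers=4):
    # Undo the layers in reverse order: un-reverse, then invert each layer.
    for _ in range(n_layers):
        z = z[::-1].copy()
        z = flow_layer_inverse(z)
    return z
```

Because every step is invertible with a tractable Jacobian, the model gives exact likelihoods (via `log_det`) while sampling proceeds patch by patch, exactly the trade-off the summary attributes to MAF-style flows.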
Low Difficulty Summary (written by GrooveSquid.com, original content)
Normalizing Flows are special kinds of computer programs that help us understand and generate data like pictures. They’re very good at this job, but not many people have been using them lately. This new paper shows that these programs can actually do a lot more than we thought they could. It introduces something called TarFlow, which is really fast and good at making predictions about what images should look like. The authors also came up with some new tricks to make the pictures it makes even better. As a result, this program is now the best one out there for guessing what an image should look like, and it can even create new images that are very realistic.

Keywords

» Artificial intelligence  » Attention  » Autoregressive  » Density estimation  » Likelihood  » Transformer