All Roads Lead to Rome? Exploring Representational Similarities Between Latent Spaces of Generative Image Models

by Charumathi Badrinath, Usha Bhalla, Alex Oesterling, Suraj Srinivas, Himabindu Lakkaraju

First submitted to arXiv on: 18 Jul 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper and are written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (the paper’s original abstract, written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (original content written by GrooveSquid.com)
Generative image models such as VAEs, GANs, Normalizing Flows (NFs), and Diffusion Models (DMs) are all designed to produce realistic images, but do they learn similar underlying representations? To investigate this, the researchers measured the similarity of the latent spaces learned by these four model classes. They used a technique called “stitching” to combine arbitrary pairs of encoders and decoders via linear maps between their latent spaces, and evaluated the stitched models with various metrics (see the sketch after the summaries). The results showed that linear maps between the latent spaces of performant models preserve most visual information even when the latent sizes differ. Among probe-able semantic attributes, gender was the most similarly represented across models trained on CelebA. Finally, the researchers found that latent space representations converge early in training.

Low Difficulty Summary (original content written by GrooveSquid.com)
Generative image models are like special computers that can create new pictures. Researchers wanted to know whether these different models learn similar things when they create images. To figure this out, they compared the “secret” representations used by four types of models: VAEs, GANs, NFs, and DMs. They did this by building a simple connection between each pair of models and then checking how well the pair worked together. The results showed that most of the important information was preserved when these connections were made, and this was especially true for gender in pictures of people. Overall, the researchers found that the secret representations used by these models become stable early on during training.

Keywords

* Artificial intelligence
* Diffusion
* Latent space