All Roads Lead to Rome? Exploring Representational Similarities Between Latent Spaces of Generative Image Models

by Charumathi Badrinath, Usha Bhalla, Alex Oesterling, Suraj Srinivas, Himabindu Lakkaraju

First submitted to arXiv on: 18 Jul 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper and are written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (the paper’s original abstract, written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (original content written by GrooveSquid.com)
Generative image models such as VAEs, GANs, Normalizing Flows (NFs), and Diffusion Models (DMs) are all designed to produce realistic images, but do they learn similar underlying representations? To investigate this, the researchers measured the similarity of the latent spaces learned by these four model classes. They used a technique called “stitching” to combine arbitrary pairs of encoders and decoders via linear maps between their latent spaces, and evaluated the stitched models with various metrics (see the sketch after the summaries). The results showed that linear maps between the latent spaces of performant models preserve most visual information even when the latent sizes differ. Among probe-able semantic attributes, gender was the most similarly represented across models trained on CelebA. Finally, the researchers found that latent space representations converge early in training.

Low Difficulty Summary (original content written by GrooveSquid.com)
Generative image models are like special computers that can create new pictures. Researchers wanted to know whether these different models learn similar things when they create images. To figure this out, they compared the “secret” representations used by four types of models: VAEs, GANs, NFs, and DMs. They did this by building a simple connection between each pair of models and then checking how well the pair worked together. The results showed that most of the important information was preserved when these connections were made, and this was especially true for gender in pictures of people. Overall, the researchers found that the secret representations used by these models become stable early on during training.

Keywords

* Artificial intelligence
* Diffusion
* Latent space