Summary of STREAM: Spatio-TempoRal Evaluation and Analysis Metric for Video Generative Models, by Pum Jun Kim et al.


STREAM: Spatio-TempoRal Evaluation and Analysis Metric for Video Generative Models

by Pum Jun Kim, Seojun Kim, Jaejun Yoo

First submitted to arXiv on: 30 Jan 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary
Written by: Paper authors
This version is the paper’s original abstract.

Medium Difficulty Summary
Written by: GrooveSquid.com (original content)
This paper proposes STREAM, a new evaluation metric for video generative models. Current metrics are largely image metrics adapted to video, so they do not account for the characteristics that are unique to video. STREAM is designed to evaluate the spatial and temporal aspects of videos independently, which gives a more comprehensive analysis and lets researchers see whether a model falls short in visual quality, in temporal naturalness, or in both. The paper also highlights the limitations of the widely used Fréchet Video Distance (FVD) metric, which is constrained by the input size of the embedding network it relies on.

Low Difficulty Summary
Written by: GrooveSquid.com (original content)
This research focuses on creating a better way to measure how well video generative models work. Currently, people use methods that were originally designed for images and are not well suited to videos. The new method, called STREAM, looks separately at the visual quality of a video and at how natural its movements are. This helps researchers find what to fix in their models so that the generated videos become more realistic and enjoyable to watch.
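For readers who want to see what a Fréchet-style metric such as FVD computes, the sketch below may help. It fits a Gaussian to each of two sets of embedding vectors and measures the Fréchet distance between the two fits. This is the general formula behind Fréchet-based scores, not the STREAM metric and not the authors' code. The function name `frechet_distance` and the random arrays are illustrative assumptions; in practice the embeddings come from a pretrained video network, which is where the input-size constraint mentioned in the summary comes from.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussian fits of two embedding sets.

    feats_a, feats_b: arrays of shape (n_samples, dim), n_samples >= 2.
    Hypothetical helper for illustration; not the paper's implementation.
    """
    feats_a = np.asarray(feats_a, dtype=np.float64)
    feats_b = np.asarray(feats_b, dtype=np.float64)

    # Fit a Gaussian (mean and covariance) to each set of embeddings.
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # covariance matrices (tiny negative values are numerical noise).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)

# Stand-in "embeddings": random vectors instead of real video features.
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 4))
fake = real + 3.0  # same spread, shifted mean

print(frechet_distance(real, real))  # close to 0 for identical sets
print(frechet_distance(real, fake))  # close to 4 * 3**2 = 36
```

Because the score collapses everything into one number computed from a single embedding, it cannot say on its own whether a gap comes from poor frames or from unnatural motion, which is the kind of separation the summaries describe STREAM as providing.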

Keywords

» Artificial intelligence  » Embedding