Summary of Video Occupancy Models, by Manan Tomar et al.


Video Occupancy Models

by Manan Tomar, Philippe Hansen-Estruch, Philip Bachman, Alex Lamb, John Langford, Matthew E. Taylor, Sergey Levine

First submitted to arxiv on: 25 Jun 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The proposed Video Occupancy Models (VOCs) are a new family of video prediction models designed to support downstream control tasks. They operate in a compact latent space, eliminating the need for pixel-level prediction. Unlike prior latent-space world models, VOCs directly predict the discounted distribution of future states in a single step, avoiding multistep roll-outs. The paper demonstrates the benefits of these properties when building predictive models of video for use in downstream control.

Low Difficulty Summary (written by GrooveSquid.com, original content)
Video prediction models can help control robots or self-driving cars by predicting what will happen next in a video. This new type of model, the Video Occupancy Model (VOC), is special because it works on a compact representation of the video rather than raw pixels, and it predicts what might happen in a single step instead of rolling out predictions one step at a time like other models do.

Keywords

» Artificial intelligence  » Latent space