Summary of Genie: Generative Interactive Environments, by Jake Bruce et al.


Genie: Generative Interactive Environments

by Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar, Richie Steigerwald, Chris Apps, Yusuf Aytar, Sarah Bechtle, Feryal Behbahani, Stephanie Chan, Nicolas Heess, Lucy Gonzalez, Simon Osindero, Sherjil Ozair, Scott Reed, Jingwei Zhang, Konrad Zolna, Jeff Clune, Nando de Freitas, Satinder Singh, Tim Rocktäschel

First submitted to arXiv on: 23 Feb 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper, each written at a different level of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper introduces Genie, the first generative interactive environment trained in an unsupervised manner from unlabelled Internet videos. The model can be prompted to generate an endless variety of action-controllable virtual worlds described through text, synthetic images, photographs, or sketches. With 11 billion parameters, Genie can be considered a foundation world model. It comprises three components: a spatiotemporal video tokenizer, an autoregressive dynamics model, and a scalable latent action model. Users can act in the generated environments on a frame-by-frame basis, even though the model was trained without any ground-truth action labels or other domain-specific requirements. The learned latent action space also makes it possible to train agents to imitate behaviors from unseen videos, which opens a path toward training generalist agents in the future.

Low Difficulty Summary (written by GrooveSquid.com, original content)
Genie is a new kind of AI model that can create virtual worlds from text, pictures, and more! It was trained on lots of online videos without any help or labels. Even so, you can control what happens in a generated world step by step, although Genie was never told which actions were taken in the videos it learned from. The people who made Genie think this could be an important step towards making robots or computers that can learn from many different things.
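
To make the frame-by-frame interaction described in the medium difficulty summary more concrete, here is a minimal Python sketch of how the three named components could fit together. It is a toy illustration under stated assumptions, not the authors' implementation: every function name, constant, and the simple "shift the tokens" dynamics rule is hypothetical, and the real components are large neural networks.

```python
from typing import List

# All names and constants here are hypothetical toy values chosen for
# illustration; they are not taken from the Genie paper or any released code.
VOCAB_SIZE = 16          # size of the toy token vocabulary
NUM_LATENT_ACTIONS = 8   # size of the toy discrete latent action set

Frame = List[int]  # in this sketch a "frame" is just a list of token ids


def tokenize(pixels: List[float]) -> Frame:
    """Toy video tokenizer: quantize pixel values in [0, 1] to token ids."""
    return [min(int(p * VOCAB_SIZE), VOCAB_SIZE - 1) for p in pixels]


def dynamics_step(history: List[Frame], action: int) -> Frame:
    """Toy dynamics model: predict the next frame's tokens from the
    frames so far and one latent action (here, a fixed token shift)."""
    if not 0 <= action < NUM_LATENT_ACTIONS:
        raise ValueError("unknown latent action")
    return [(token + action) % VOCAB_SIZE for token in history[-1]]


def infer_latent_action(prev: Frame, nxt: Frame) -> int:
    """Toy latent action model: recover the action that explains the
    change between two consecutive frames, without any action label."""
    return (nxt[0] - prev[0]) % VOCAB_SIZE


def rollout(prompt_pixels: List[float], actions: List[int]) -> List[Frame]:
    """Start from a prompt frame and apply one latent action per step."""
    history = [tokenize(prompt_pixels)]
    for action in actions:
        history.append(dynamics_step(history, action))
    return history


if __name__ == "__main__":
    frames = rollout([0.0, 0.25, 0.5, 0.99], actions=[1, 3])
    for step, frame in enumerate(frames):
        print(step, frame)
    # The toy latent action model recovers the action from frames alone.
    print(infer_latent_action(frames[0], frames[1]))
```

In this sketch the user supplies one latent action per frame, which mirrors the frame-by-frame control the summary describes, and `infer_latent_action` hints at how actions might be recovered from consecutive frames when no labels are available.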

Keywords

  • Artificial intelligence
  • Autoregressive
  • Spatiotemporal
  • Tokenizer