Summary of MuDreamer: Learning Predictive World Models without Reconstruction, by Maxime Burchi et al.


MuDreamer: Learning Predictive World Models without Reconstruction

by Maxime Burchi, Radu Timofte

First submitted to arXiv on: 23 May 2024

Categories

  • Main: Artificial Intelligence (cs.AI)
  • Secondary: Computer Vision and Pattern Recognition (cs.CV)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (original content by GrooveSquid.com)
MuDreamer is a reinforcement learning agent that builds on DreamerV3 but learns a predictive world model without reconstructing its input signals. Instead of DreamerV3's pixel reconstruction loss, MuDreamer learns hidden representations by predicting the environment's value function and the previously selected actions. This makes the agent more robust to visual distractions in the observations, as demonstrated on the DeepMind Visual Control Suite and the Atari100k benchmark, and it also trains faster than DreamerV3. The authors find that batch normalization is crucial to prevent learning collapse in such predictive self-supervised methods. A minimal code sketch of this predictive objective is given after the summaries below.

Low Difficulty Summary (original content by GrooveSquid.com)
The researchers created a new AI agent called MuDreamer that can learn from its environment without needing to recreate the information it sees. This is different from another popular AI model, DreamerV3, which tries to reconstruct what it sees. MuDreamer does better than DreamerV3 at ignoring distractions and can even work with real-world videos in the background. It also trains faster than DreamerV3 on certain tasks.
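
To make the predictive objective described in the medium-difficulty summary more concrete, the following is a minimal, hypothetical sketch in PyTorch of a reconstruction-free representation loss that predicts a value target and the previously selected action, with batch normalization in the encoder. The module names, dimensions, and unit loss weights are illustrative assumptions and are not taken from the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PredictiveRepresentation(nn.Module):
    """Hypothetical encoder with value- and action-prediction heads (no pixel decoder)."""

    def __init__(self, obs_dim=64, action_dim=6, latent_dim=128):
        super().__init__()
        # Batch normalization in the encoder; the paper reports it is important
        # to prevent collapse of the learned representations.
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, latent_dim),
            nn.BatchNorm1d(latent_dim),
            nn.ReLU(),
        )
        # Prediction heads that replace the reconstruction decoder.
        self.value_head = nn.Linear(latent_dim, 1)             # predicts the value target
        self.action_head = nn.Linear(latent_dim, action_dim)   # predicts the previous action

    def loss(self, obs, prev_action, value_target):
        """Representation loss: value prediction + previous-action prediction."""
        latent = self.encoder(obs)
        value_loss = F.mse_loss(self.value_head(latent).squeeze(-1), value_target)
        action_loss = F.cross_entropy(self.action_head(latent), prev_action)
        return value_loss + action_loss


if __name__ == "__main__":
    model = PredictiveRepresentation()
    obs = torch.randn(32, 64)                 # batch of flattened observations (assumed shape)
    prev_action = torch.randint(0, 6, (32,))  # previously selected discrete actions
    value_target = torch.randn(32)            # value targets, e.g. produced by a learned critic
    print(model.loss(obs, prev_action, value_target).item())
```

In the full agent these terms would be computed on latent states produced by a recurrent world model and combined with DreamerV3's other objectives; the sketch isolates only the reconstruction-free representation losses.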

Keywords

» Artificial intelligence  » Batch normalization  » Reinforcement learning  » Self supervised