Summary of MuDreamer: Learning Predictive World Models Without Reconstruction, by Maxime Burchi et al.
MuDreamer: Learning Predictive World Models without Reconstruction
by Maxime Burchi, Radu Timofte
First submitted to arXiv on: 23 May 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract here
Medium | GrooveSquid.com (original content) | MuDreamer is a reinforcement learning agent that builds on DreamerV3 but learns a predictive world model without reconstructing the input signal. Whereas DreamerV3 relies on a pixel reconstruction loss, MuDreamer learns hidden representations by predicting the environment's value function and previously selected actions. This makes the model more robust to visual distractions in the observation, as demonstrated on the DeepMind Visual Control Suite and the Atari100k benchmark, and it also trains faster than DreamerV3. The authors find batch normalization crucial to prevent learning collapse in predictive self-supervised methods.
Low | GrooveSquid.com (original content) | The researchers created a new AI agent called MuDreamer that can learn from its environment without needing to recreate the information it sees. This differs from another popular model, DreamerV3, which tries to reconstruct what it sees. MuDreamer is better than DreamerV3 at ignoring distractions and can even cope with real-world videos playing in the background. It also trains faster than DreamerV3 on certain tasks.
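The summaries above describe MuDreamer's core idea: instead of a pixel reconstruction loss, the latent states are trained with predictive targets (the value function and the previously selected action), with batch normalization preventing representation collapse. The following is a minimal, hedged sketch of what such a loss could look like; it is not the authors' code, and the array shapes, the linear prediction heads, and the simple log-softmax are all assumptions made for illustration.

```python
# Hedged sketch (not the paper's implementation): a reconstruction-free
# predictive loss over latent states, combining value prediction (MSE)
# and previous-action prediction (cross-entropy). Shapes and linear
# heads are illustrative assumptions.
import numpy as np

def batch_norm(z, eps=1e-5):
    # Normalize latents across the batch dimension; the paper reports
    # batch normalization is important to avoid learning collapse.
    return (z - z.mean(axis=0)) / (z.std(axis=0) + eps)

def predictive_loss(z, W_v, W_a, value_targets, prev_actions):
    """z: (B, D) latent states; W_v: (D,) value head; W_a: (D, A) action head.
    Returns value-prediction MSE plus action-prediction cross-entropy."""
    z = batch_norm(z)
    v_pred = z @ W_v                              # predicted value per state
    value_loss = np.mean((v_pred - value_targets) ** 2)
    logits = z @ W_a                              # logits over A discrete actions
    logp = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    action_loss = -np.mean(logp[np.arange(len(z)), prev_actions])
    return value_loss + action_loss               # nonnegative scalar

rng = np.random.default_rng(0)
B, D, A = 8, 16, 4                                # batch, latent dim, actions
loss = predictive_loss(rng.normal(size=(B, D)),
                       rng.normal(size=D), rng.normal(size=(D, A)),
                       rng.normal(size=B), rng.integers(0, A, size=B))
```

In a real agent these heads would be neural networks trained jointly with the world model, and the value targets would come from the critic, but the sketch shows why no decoder (and hence no pixel reconstruction) is needed to provide a learning signal.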
Keywords
» Artificial intelligence » Batch normalization » Reinforcement learning » Self supervised