Summary of One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill, by Sangwoo Shin et al.
One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill
by Sangwoo Shin, Daehee Lee, Minjong Yoo, Woo Kyung Kim, Honguk Woo
First submitted to arXiv on: 13 Feb 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | This paper presents a novel skill-based imitation learning framework that enables one-shot imitation and zero-shot adaptation for complex tasks. The framework infers a semantic skill sequence from a single demonstration and optimizes each skill for the environment's hidden dynamics. The approach leverages a vision-language model to learn a semantic skill set from offline video datasets, allowing adaptation to different conditions and modalities. |
| Low | GrooveSquid.com (original content) | Imagine you're trying to teach someone how to do something new, like riding a bike. You show them how to balance, pedal, and steer once, and they have to figure it out from there. This is called one-shot imitation. It's hard, especially when the environment changes or gets more complex. The researchers in this paper came up with a way to make one-shot imitation easier by breaking complex tasks down into smaller skills. They used a special kind of computer model that can understand both pictures and words to learn these skills from videos. This allowed the system to adapt to changing environments and different demonstration conditions. |
Keywords
» Artificial intelligence » Language model » One shot » Zero shot