Summary of SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation, by Wenjia Wang et al.
SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation
by Wenjia Wang, Liang Pan, Zhiyang Dou, Jidong Mei, Zhouyingcheng Liao, Yuke Lou, Yifan Wu, Lei Yang, Jingbo Wang, Taku Komura
First submitted to arXiv on: 29 Nov 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Graphics (cs.GR)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary SIMS is a hierarchical framework for simulating stylized human-scene interactions (HSI) in physical environments. It couples high-level, script-driven intent with a low-level control policy. At the high level, Large Language Models combined with Retrieval-Augmented Generation (RAG) produce coherent and diverse long-form scripts. At the low level, a versatile multi-condition, physics-based control policy uses text embeddings of the generated scripts to encode stylistic cues, which makes the resulting HSI motions more expressive and diverse. The framework is evaluated on a range of tasks and scenarios, where it executes HSI motions effectively and generalizes across different situations. According to the reported results, SIMS outperforms previous methods, making it a promising approach to simulating realistic and diverse human-scene interactions. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary SIMS is a new way to make computers simulate how people interact with their surroundings. This matters because it can help us learn how humans behave in different situations, which is useful for building robots that work safely alongside people or for developing video games that are realistic and fun. The system has two main parts: one that generates scripts describing the interaction, and another that controls how the simulated character moves in the scene. Together, they let the computer create a wide range of interactions that look and feel more natural. |
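
To make the two-level idea in the summaries above more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the mini-corpus, the word-overlap retrieval, the prompt layout, and the hashed "embedding" are all assumptions made for illustration, and the LLM call itself is left out. It only shows the general pattern of retrieving reference scripts, building a prompt from them, and attaching a text embedding to the input of a control policy.

```python
import math
import zlib
from collections import Counter

# Hypothetical mini-corpus of interaction scripts (invented for this sketch).
CORPUS = [
    "A tired man walks slowly to the sofa and sits down heavily.",
    "An excited child runs to the chair and sits down quickly.",
    "A woman lies down on the bed and rests calmly.",
]

def tokenize(text):
    """Lowercase and split into words, dropping basic punctuation."""
    return text.lower().replace(".", "").replace(",", "").split()

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values()))
    norm *= math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, corpus, k=2):
    """Retrieval step: return the k corpus scripts most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(corpus, key=lambda doc: cosine(q, Counter(tokenize(doc))),
                    reverse=True)
    return ranked[:k]

def build_prompt(request, examples):
    """Augmentation step: place retrieved scripts in front of the request.
    A real system would send this prompt to an LLM (omitted here)."""
    lines = ["Reference scripts:"]
    lines += [f"- {ex}" for ex in examples]
    lines.append(f"Write a new script for: {request}")
    return "\n".join(lines)

def toy_text_embedding(text, dim=8):
    """Stand-in for a learned text encoder: hashed, L2-normalized word counts."""
    vec = [0.0] * dim
    for word in tokenize(text):
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def policy_observation(body_state, script_line):
    """Conditioning step: concatenate the character state with the text embedding."""
    return list(body_state) + toy_text_embedding(script_line)

request = "a tired man sits on the sofa"
examples = retrieve(request, CORPUS, k=2)
prompt = build_prompt(request, examples)
obs = policy_observation([0.0, 0.9, 0.1], examples[0])
print(prompt)
print(len(obs))
```

In this toy version the script line acts as a style cue only because its embedding is appended to the policy input. A real physics-based controller would be trained to map such inputs to joint-level actions, which this sketch does not attempt.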
Keywords
» Artificial intelligence » RAG » Retrieval-augmented generation