Summary of Synthesizing Moving People with 3D Control, by Boyi Li et al.
Synthesizing Moving People with 3D Control
by Boyi Li, Junming Chen, Jathushan Rajasegaran, Yossi Gandelsman, Alexei A. Efros, Jitendra Malik
First submitted to arXiv on: 19 Jan 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | This paper presents a novel approach to animating a person from a single image, using a diffusion-model-based framework to generate motion sequences under 3D control. The method has two core components: learning priors about unseen body parts and clothing, and rendering novel poses with consistent texture and clothing. The authors develop an in-filling diffusion model trained in texture-map space, which allows sample-efficient hallucination of unseen parts from a single image. They also design a diffusion-based rendering pipeline controlled by 3D human poses, producing realistic renderings of novel poses, including clothing, hair, and plausible in-filling of unseen regions. Because appearance and pose are disentangled, the approach generates prolonged motions and varied, challenging poses more faithfully than prior methods (a minimal sketch of this two-stage structure follows the table). |
| Low | GrooveSquid.com (original content) | This paper creates a way to bring people to life from just one photo. It uses special computer models to make it happen! First, the model learns what hidden parts of our bodies look like, such as our underwear or socks. Then it uses that information to create new pictures of us doing different movements or poses. The pictures look really realistic and can even show us wearing different clothes or having different hairstyles. This is a big deal because it helps us understand how people move and behave in different situations. It’s like having a magic camera that can make anything happen! |
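The medium-difficulty summary describes a two-stage architecture: first complete the person's texture map with an in-filling diffusion model, then render each target 3D pose with a pose-conditioned diffusion model. The sketch below illustrates only that structure; it is not the paper's implementation. Every name (`extract_partial_texture`, `infill_texture_map`, `render_pose`), the resolutions, and the 72-dimensional SMPL-style pose vectors are assumptions, and the denoising loops are NumPy placeholders standing in for trained diffusion models.

```python
# Minimal structural sketch of the two-stage pipeline, assuming hypothetical
# names and shapes. Stage 1 hallucinates unseen texture in UV space;
# stage 2 renders frames conditioned on 3D poses. NumPy stands in for the
# trained diffusion models used in the actual system.
import numpy as np

UV_RES = 256      # texture-map (UV) resolution, assumed
FRAME_RES = 512   # output frame resolution, assumed

def extract_partial_texture(image: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Project visible pixels of a single image onto the UV texture map.
    Returns the partial texture and a visibility mask (1 = observed)."""
    texture = np.zeros((UV_RES, UV_RES, 3), dtype=np.float32)
    mask = np.zeros((UV_RES, UV_RES, 1), dtype=np.float32)
    # Real system: a dense image-to-UV lookup; here we mark a dummy region.
    mask[64:192, 64:192] = 1.0
    texture[64:192, 64:192] = image[:128, :128].mean() / 255.0
    return texture, mask

def infill_texture_map(partial: np.ndarray, mask: np.ndarray,
                       steps: int = 50) -> np.ndarray:
    """Stage 1: in-filling diffusion in texture-map space.
    Hallucinates unseen regions while keeping observed pixels fixed."""
    x = np.random.randn(*partial.shape).astype(np.float32)  # start from noise
    for _ in range(steps):
        x = 0.9 * x  # placeholder for one learned denoising step
        x = mask * partial + (1.0 - mask) * x  # re-impose observed texture
    return x

def render_pose(texture: np.ndarray, pose: np.ndarray,
                steps: int = 50) -> np.ndarray:
    """Stage 2: diffusion-based rendering conditioned on a 3D body pose,
    producing one frame with clothing/hair detail and plausible in-filling."""
    frame = np.random.randn(FRAME_RES, FRAME_RES, 3).astype(np.float32)
    for _ in range(steps):
        # Placeholder for a denoising step conditioned on (texture, pose).
        frame = 0.9 * frame + 0.1 * texture.mean()
    return frame

# Driving a single source image with a motion sequence:
source_image = np.zeros((FRAME_RES, FRAME_RES, 3), dtype=np.uint8)
motion = [np.zeros(72, dtype=np.float32) for _ in range(4)]  # SMPL-style poses

partial, visible = extract_partial_texture(source_image)
full_texture = infill_texture_map(partial, visible)  # computed once per person
video = [render_pose(full_texture, pose) for pose in motion]
```

Note the disentanglement the summary emphasizes: the texture map is completed once per person, while the renderer is driven frame by frame purely by the 3D pose, which is what lets appearance stay consistent over long motion sequences.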
Keywords
- Artificial intelligence
- Diffusion
- Diffusion model
- Hallucination