Summary of Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation, by Minglin Chen and Weihao Yuan and Yukun Wang and Zhe Sheng and Yisheng He and Zilong Dong and Liefeng Bo and Yulan Guo
Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation
by Minglin Chen, Weihao Yuan, Yukun Wang, Zhe Sheng, Yisheng He, Zilong Dong, Liefeng Bo, Yulan Guo
First submitted to arXiv on: 25 Jan 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | Recently, text-to-3D approaches have achieved high-fidelity 3D content generation from text descriptions. However, these methods lack fine-grained control over the generated objects. To address this limitation, researchers introduced sketches as a cheap way to provide control. However, sketches are abstract and ambiguous, making it challenging to achieve flexible control from them. In response, we propose Sketch2NeRF, a multi-view sketch-guided text-to-3D generation framework. Our method leverages pretrained 2D diffusion models (Stable Diffusion and ControlNet) to supervise the optimization of a 3D scene represented by a neural radiance field (NeRF). We also introduce a novel synchronized generation and reconstruction method to optimize the NeRF effectively. The proposed method is evaluated on two multi-view sketch datasets, demonstrating high-fidelity text-to-3D generation with fine-grained sketch control, and achieves state-of-the-art performance in terms of sketch similarity and text alignment. (A toy illustration of this optimization loop appears below the table.) |
Low | GrooveSquid.com (original content) | Imagine being able to create 3D objects from just a few words! Recently, computers have gotten really good at doing this using only text descriptions. However, the resulting objects are not always perfect or exactly what we want. To make it easier to control the generation process, people use simple sketches as a guide. These sketches can be tricky to work with because they’re abstract and open to interpretation. In this paper, we develop a new way to generate 3D objects using text descriptions and sketches as guides. Our method is able to create highly detailed and realistic 3D objects that match the original sketch and text description. We tested our method on two different sets of sketches and found it performed better than other methods in terms of matching the sketch and text. |
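The medium-difficulty summary describes a NeRF whose optimization is supervised by sketch-conditioned 2D diffusion models through a synchronized generation and reconstruction procedure. The toy Python sketch below illustrates only the shape of such a loop under strong simplifying assumptions: `TinyNeRF`, `sketch_guided_score`, and all shapes are illustrative stand-ins (a tiny MLP replaces the real NeRF renderer, and a stub replaces the frozen Stable Diffusion + ControlNet guidance). It is not the authors' implementation.

```python
# Hypothetical, simplified sketch of a generation-and-reconstruction loop:
# render a view from the NeRF, obtain a target image from a (stubbed)
# sketch-conditioned diffusion prior, and update the NeRF toward it.
import torch
import torch.nn as nn


class TinyNeRF(nn.Module):
    """Stand-in for the NeRF: maps a camera-pose embedding to a rendered view."""

    def __init__(self, pose_dim: int = 16, image_size: int = 64):
        super().__init__()
        self.image_size = image_size
        self.mlp = nn.Sequential(
            nn.Linear(pose_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 * image_size * image_size),
        )

    def forward(self, pose: torch.Tensor) -> torch.Tensor:
        out = self.mlp(pose)
        return out.view(-1, 3, self.image_size, self.image_size)


def sketch_guided_score(rendered: torch.Tensor, sketch: torch.Tensor) -> torch.Tensor:
    """Stub for the frozen diffusion prior.

    In the actual framework this role is played by Stable Diffusion + ControlNet,
    conditioned on the text prompt and the per-view sketch; here a simple blend
    stands in so the loop runs end to end.
    """
    with torch.no_grad():
        return 0.9 * rendered + 0.1 * sketch  # placeholder guidance target


nerf = TinyNeRF()
optimizer = torch.optim.Adam(nerf.parameters(), lr=1e-3)

# One sketch per training view (random stand-ins for real multi-view sketches).
poses = torch.randn(8, 16)
sketches = torch.rand(8, 3, 64, 64)

for step in range(100):
    i = step % poses.shape[0]             # pick a view with a known sketch
    rendered = nerf(poses[i:i + 1])       # render the NeRF from that view
    target = sketch_guided_score(rendered, sketches[i:i + 1])  # "generation" step
    loss = ((rendered - target) ** 2).mean()                   # "reconstruction" step
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In the real framework, the stubbed guidance step would presumably be replaced by the sketch-conditioned diffusion model producing per-view target images (or scores) that the NeRF is then optimized to reconstruct, as described in the summary above; the stub here only preserves that generate-then-reconstruct structure.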
Keywords
» Artificial intelligence » Alignment » Diffusion » Optimization