Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data
by Tuo Feng, Wenguan Wang, Ruijie Quan, Yi Yang
First submitted to arXiv on: 14 Jul 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract (available on arXiv). |
| Medium | GrooveSquid.com (original content) | A novel pre-training method called Shape2Scene (S2S) is proposed to tackle the data-desert issue in 3D self-supervised learning: it learns representations of large-scale 3D scenes from 3D shape data, which are far easier to collect. A multiscale, high-resolution backbone, MH-P/V, is designed for both point-based and voxel-based tasks and captures deep semantic information across multiple scales. A Shape-to-Scene strategy (S2SS) amalgamates points from various shapes into random pseudo-scenes that serve as training data, and a point-point contrastive loss (PPC) is applied to pre-train MH-P/V. The approach demonstrates the transferability of 3D representations across shape-level and scene-level tasks, achieving notable performance on well-known benchmarks such as ScanObjectNN (93.8% OA) and ShapeNetPart (87.6% instance mIoU), along with promising results in 3D semantic segmentation and 3D object detection. |
| Low | GrooveSquid.com (original content) | Shape2Scene is a new way to learn about 3D scenes using 3D shape data. This is helpful because it's easier to collect 3D shape data than 3D scene data. The method uses a special kind of neural network, called MH-P/V, that's good at understanding both small and big details in 3D objects. It also combines points from different shapes into a fake 3D scene, which helps the network learn about scenes. The approach is tested on several tasks and shows good results. |
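To make the Shape-to-Scene idea concrete, here is a minimal sketch of how points from several shapes could be amalgamated into a pseudo-scene. This is an illustrative reconstruction, not the authors' exact S2SS procedure: the grid layout, normalization, and function name are assumptions. The per-point shape IDs it returns hint at how a point-point contrastive loss could treat points from the same shape as positives.

```python
import numpy as np

def shapes_to_pseudo_scene(shapes, grid=2, spacing=2.0, seed=0):
    """Place each shape's point cloud in a random grid cell to form a pseudo-scene.

    `shapes` is a list of (N_i, 3) arrays. Hypothetical sketch of the
    Shape-to-Scene (S2SS) idea; details differ from the paper.
    """
    rng = np.random.default_rng(seed)
    cells = [(i, j) for i in range(grid) for j in range(grid)]
    order = rng.permutation(len(cells))  # random shape-to-cell assignment
    scene_pts, shape_ids = [], []
    for shape_id, pts in enumerate(shapes):
        i, j = cells[order[shape_id % len(cells)]]
        # Normalize the shape into the unit sphere, then offset it to its cell.
        centered = pts - pts.mean(axis=0)
        centered = centered / (np.linalg.norm(centered, axis=1).max() + 1e-8)
        offset = np.array([i * spacing, j * spacing, 0.0])
        scene_pts.append(centered + offset)
        # Track each point's source shape; a point-point contrastive loss
        # could use these IDs to define positive pairs.
        shape_ids.append(np.full(len(pts), shape_id))
    return np.concatenate(scene_pts), np.concatenate(shape_ids)

# Four toy "shapes" of 128 points each -> one 512-point pseudo-scene.
shapes = [np.random.rand(128, 3) for _ in range(4)]
scene, ids = shapes_to_pseudo_scene(shapes)
print(scene.shape, ids.shape)  # (512, 3) (512,)
```

In the actual method, pseudo-scenes like this are fed to the MH-P/V backbone and pre-trained with the PPC loss before transferring to downstream shape- and scene-level tasks.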
Keywords
» Artificial intelligence » Contrastive loss » Neural network » Object detection » Self supervised » Semantic segmentation » Transferability