Summary of Exploiting Data Hierarchy As a New Modality For Contrastive Learning, by Arjun Bhalla et al.
Exploiting Data Hierarchy as a New Modality for Contrastive Learning
by Arjun Bhalla, Daniel Levenson, Jan Bernhard, Anton Abilov
First submitted to arXiv on: 6 Jan 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper's original abstract, available from its arXiv listing. |
Medium | GrooveSquid.com (original content) | This paper explores how hierarchically structured data can help neural networks learn abstract representations of cathedrals. The study uses the WikiScenes dataset, which organizes cathedral components in a spatial hierarchy. A novel contrastive training approach is proposed that uses a triplet margin loss to represent this spatial hierarchy in the encoder's latent space. The method investigates whether the dataset structure provides valuable information for self-supervised learning. To visualize the results, t-SNE is applied to the latent space, and the proposed approach is evaluated against other dataset-specific contrastive learning methods using a common downstream classification task. The findings suggest that dataset structure is a valuable modality for weakly-supervised learning. |
Low | GrooveSquid.com (original content) | This research looks at how data about cathedrals can help computers learn to understand complex ideas about buildings. The study uses a dataset called WikiScenes, which organizes cathedral parts into a hierarchy that reflects where they sit in the building. A new way of training the computer is proposed that uses this organization to teach it without needing labels. The method tests whether this structure can help the computer learn on its own. The results are visualized with a technique called t-SNE. The study finds that using the dataset's structure helps the computer learn better than other methods. |
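
The triplet margin loss mentioned in the medium-difficulty summary can be illustrated with a minimal sketch. This is not the authors' implementation: the tuple-based hierarchy paths, the helper names `shared_depth` and `is_valid_triplet`, and the rule "the positive shares a deeper common ancestor with the anchor than the negative does" are all assumptions made here for illustration, and the embeddings are toy 2-D points rather than encoder outputs.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(d(a, p) - d(a, n) + margin, 0)."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


def shared_depth(path_a, path_b):
    """Number of leading hierarchy levels two paths have in common."""
    depth = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        depth += 1
    return depth


def is_valid_triplet(anchor_path, positive_path, negative_path):
    """Assumed sampling rule (hypothetical): the positive must sit closer
    to the anchor in the hierarchy than the negative does."""
    return shared_depth(anchor_path, positive_path) > shared_depth(anchor_path, negative_path)


# Hypothetical hierarchy paths, not taken from WikiScenes itself.
facade = ("Cathedral", "Exterior", "Facade")
portal = ("Cathedral", "Exterior", "Portal")
nave = ("Cathedral", "Interior", "Nave")

if is_valid_triplet(facade, portal, nave):
    # Toy embeddings: the positive is already closer than the negative
    # by more than the margin, so the loss is zero.
    loss = triplet_margin_loss((0.0, 0.0), (0.0, 1.0), (3.0, 4.0))
    print(loss)
```

In this sketch the loss is zero whenever the positive is closer to the anchor than the negative by at least the margin, so training would only push on triplets whose latent-space distances still disagree with the hierarchy.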
Keywords
» Artificial intelligence » Classification » Encoder » Latent space » Self-supervised » Supervised