CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning

by Yiping Wang, Yifang Chen, Wendan Yan, Alex Fang, Wenjing Zhou, Kevin Jamieson, Simon Shaolei Du

First submitted to arXiv on: 29 May 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computer Vision and Pattern Recognition (cs.CV)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper proposes two novel data selection approaches for vision-language model pretraining, targeting the noisy web-curated datasets used to train CLIP models. The first method, surrogate-CLIPLoss (s-CLIPLoss), refines the classical CLIP score by normalizing it against a sample's contrastive pairs, yielding a better measure of pair quality. The second method, NormSim, is a norm-based metric that measures the similarity between pretraining data and data from a known downstream target distribution. Evaluated on the DataComp benchmark, the methods achieve a 5.3% improvement on ImageNet-1k and a 2.8% improvement averaged over 38 downstream tasks compared with the best baseline using OpenAI's CLIP-L/14 embeddings.
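
To make the two scoring rules concrete, below is a minimal NumPy sketch of how they could be computed from precomputed embeddings. This is an illustrative reading of the summary, not the authors' released code: the function names (s_cliploss, normsim), the batch size, the temperature tau, and the clipping of negative similarities for finite p are assumptions of the sketch.

    import numpy as np
    from scipy.special import logsumexp

    def s_cliploss(img_emb, txt_emb, num_batches=5, batch_size=1024, tau=0.01, seed=0):
        # Score each image-text pair by its negated symmetric CLIP (InfoNCE) loss,
        # averaged over several randomly sampled batches, so each pair is judged
        # against its contrastive pairs rather than in isolation.
        # img_emb, txt_emb: L2-normalized (N, d) embeddings from a pretrained
        # "surrogate" CLIP model. Higher score = better pair.
        rng = np.random.default_rng(seed)
        n = img_emb.shape[0]
        scores = np.zeros(n)
        for _ in range(num_batches):
            idx = rng.permutation(n)
            for start in range(0, n, batch_size):
                b = idx[start:start + batch_size]
                logits = img_emb[b] @ txt_emb[b].T / tau   # (B, B) pairwise similarities
                diag = np.diag(logits)
                i2t = diag - logsumexp(logits, axis=1)     # image -> text log-softmax
                t2i = diag - logsumexp(logits, axis=0)     # text -> image log-softmax
                scores[b] += 0.5 * (i2t + t2i)
        return scores / num_batches

    def normsim(img_emb, target_emb, p=np.inf):
        # NormSim: the p-norm of each training image's similarity vector against
        # a set of target-task image embeddings (all L2-normalized). With p=inf
        # this reduces to the maximum similarity to any single target example.
        sims = img_emb @ target_emb.T                      # (N, N_target)
        if np.isinf(p):
            return sims.max(axis=1)
        # clip negatives so fractional or odd powers stay well defined
        return (np.clip(sims, 0.0, None) ** p).sum(axis=1) ** (1.0 / p)

In practice one would rank the candidate pool by each score and keep the highest-ranked examples, for instance by intersecting the top fractions of the two rankings before pretraining.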
Low Difficulty Summary (written by GrooveSquid.com, original content)
The paper introduces new methods for selecting high-quality data for vision-language model pretraining, which is crucial for training large-scale models like CLIP. The authors propose two approaches to address noisy web-curated datasets: s-CLIPLoss and NormSim. These methods aim to improve data selection by considering the alignment between samples and their contrastive pairs, as well as the similarity between pretraining data and target data.

Keywords

» Artificial intelligence  » Alignment  » Language model  » Pretraining