


Training-Free Unsupervised Prompt for Vision-Language Models

by Sifan Long, Linbin Wang, Zhen Zhao, Zichang Tan, Yiming Wu, Shengsheng Wang, Jingdong Wang

First submitted to arXiv on: 25 Apr 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes Training-Free Unsupervised Prompts (TFUP), a novel approach for adapting large pre-trained vision-language models to downstream tasks without explicit labeling or training. TFUP leverages pseudo-labels as supervisory information but, unlike existing methods, maximally preserves the model's inherent representation capabilities and enhances them with a residual connection to similarity-based prediction probabilities. It integrates instance confidence and prototype scores to select representative samples, which are used to build a reliable Feature Cache Model (FCM) for training-free inference. A Multi-level Similarity Measure (MSM) then computes the distance between each test image and each cached sample, which serves as the weight of the corresponding cached label. TFUP achieves surprising performance, even surpassing training-based methods on multiple classification datasets. The paper also proposes a training-based extension (TFUP-T) that adopts an additional marginal distribution entropy loss to further boost adaptation performance.
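To make the cache-based inference idea concrete, here is a minimal sketch of how a feature cache with similarity-weighted pseudo-labels and a residual connection to zero-shot probabilities could work. This is an illustrative reconstruction, not the paper's actual implementation: the function name, the softmax weighting, and the `alpha` mixing parameter are assumptions; the paper's Multi-level Similarity Measure combines several similarity levels rather than the single cosine similarity used here.

```python
import numpy as np

def cache_predict(test_feat, cache_feats, cache_labels, zero_shot_probs, alpha=0.5):
    """Illustrative training-free, cache-based prediction.

    test_feat:       (d,)   L2-normalized feature of the test image
    cache_feats:     (n, d) L2-normalized features of cached representative samples
    cache_labels:    (n, c) one-hot pseudo-labels of the cached samples
    zero_shot_probs: (c,)   similarity-based (zero-shot) prediction probabilities
    alpha:           mixing weight between cache and zero-shot predictions (assumed)
    """
    # Cosine similarity between the test image and each cached sample.
    sims = cache_feats @ test_feat                      # (n,)
    # Turn similarities into weights over cached labels (softmax is an assumption).
    weights = np.exp(sims) / np.exp(sims).sum()         # (n,)
    # Similarity-weighted vote over cached pseudo-labels.
    cache_probs = weights @ cache_labels                # (c,)
    # Residual combination with the model's own zero-shot probabilities,
    # preserving the pre-trained representation's predictions.
    return alpha * cache_probs + (1 - alpha) * zero_shot_probs
```

The key design point this sketch captures is that no parameters are updated: adaptation comes entirely from the cached features and pseudo-labels, while the residual term keeps the pre-trained model's similarity-based predictions in the loop.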
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about finding ways to improve large computer models so they can be used for different tasks without needing lots of labeled data. The authors propose a new method called Training-Free Unsupervised Prompts (TFUP) that uses automatically generated "fake" labels (pseudo-labels) to help the model learn and adapt. TFUP works by selecting important samples from the data and using them to make predictions. The approach is surprisingly effective, even outperforming traditional training-based methods on some datasets. The paper also suggests an improved version called TFUP-T that can further boost performance.

Keywords

» Artificial intelligence  » Classification  » Inference  » Unsupervised