Training-Free Unsupervised Prompt for Vision-Language Models
by Sifan Long, Linbin Wang, Zhen Zhao, Zichang Tan, Yiming Wu, Shengsheng Wang, Jingdong Wang
First submitted to arXiv on: 25 Apr 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | This paper proposes Training-Free Unsupervised Prompt (TFUP), a novel approach for adapting large pre-trained vision-language models to downstream tasks without explicit labeling or training. TFUP leverages pseudo-labels as supervisory information but, unlike existing methods, maximally preserves the model's inherent representation capabilities, enhancing them through a residual connection to similarity-based prediction probabilities. The approach integrates instance confidence and prototype scores to select representative samples, which are used to build a reliable Feature Cache Model (FCM) for training-free inference. In addition, a Multi-level Similarity Measure (MSM) computes the distance between each test image and each cached sample, which serves as the weight of the corresponding cached label. TFUP achieves surprisingly strong performance, even surpassing training-based methods on multiple classification datasets. Building on this, the paper also proposes a training-based extension (TFUP-T) that adopts an additional marginal distribution entropy loss to further boost adaptation performance. |
| Low | GrooveSquid.com (original content) | This paper is about finding ways to adapt large computer models to different tasks without needing lots of labeled data. The authors propose a new method called Training-Free Unsupervised Prompt (TFUP) that uses automatically generated "pseudo" labels to help the model learn and adapt. TFUP works by selecting important samples from the data and using them to make predictions. The approach is surprisingly effective, even outperforming traditional training-based methods on some datasets. The paper also suggests an improved version of this method, called TFUP-T, that further boosts performance. |
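The cache-and-weight mechanism described in the medium summary can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: the function names, the single-level similarity (the paper's MSM is multi-level), and the hyperparameters `alpha` and `beta` are all illustrative assumptions. Cached pseudo-labels are combined using the test image's similarity to each cached sample as the weight, and the result is added as a residual to the pre-trained model's own zero-shot logits; the TFUP-T marginal entropy term is sketched at the end.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Normalize features so that dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def cache_logits(test_feat, cache_feats, cache_labels, num_classes, beta=5.0):
    # Similarity-based weighting (simplified to a single level): the
    # similarity between the test image and each cached sample becomes
    # the weight of that sample's pseudo-label.
    sims = cache_feats @ test_feat               # cosine similarities, shape (N,)
    weights = np.exp(-beta * (1.0 - sims))       # sharpen: nearer samples count more
    one_hot = np.eye(num_classes)[cache_labels]  # (N, C) cached pseudo-labels
    return weights @ one_hot                     # (C,) cache-based class scores

def tfup_predict(test_feat, zero_shot_logits, cache_feats, cache_labels,
                 num_classes, alpha=1.0):
    # Residual connection: cache scores refine, rather than replace,
    # the pre-trained model's own similarity-based prediction.
    logits = zero_shot_logits + alpha * cache_logits(
        test_feat, cache_feats, cache_labels, num_classes)
    return int(np.argmax(logits))

def marginal_entropy_loss(probs):
    # TFUP-T regularizer (sketch): minimizing the negative entropy of the
    # batch-averaged (marginal) class distribution pushes predictions to
    # spread across classes instead of collapsing onto a few.
    marginal = probs.mean(axis=0)                # (C,) marginal distribution
    return float(np.sum(marginal * np.log(marginal + 1e-12)))
```

As a usage sketch, a test feature close to a cached class-0 sample would be pulled toward class 0 even when the zero-shot logits are uninformative; the residual form means a strong zero-shot prediction is only nudged, not overwritten.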
Keywords
» Artificial intelligence » Classification » Inference » Unsupervised