Is Less More? Exploring Token Condensation as Training-free Test-time Adaptation

by Zixin Wang, Dong Gong, Sen Wang, Zi Huang, Yadan Luo

First submitted to arXiv on: 16 Oct 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract, which can be read on the paper's arXiv page.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Contrastive Language-Image Pretraining (CLIP) excels at learning generalizable image representations but often struggles with zero-shot inference on specific downstream datasets. To address this, the researchers investigate token condensation (TC) techniques that refine token usage during inference and improve visual-text alignment in vision-language models (VLMs) such as CLIP on unseen datasets. Existing TC methods, however, often fail to maintain in-distribution performance when reducing tokens, which motivates a new training-free adaptation method called Token Condensation as Adaptation (TCA). TCA condenses token representations by introducing reservoir-based domain anchor tokens for information-preserving token reduction, combined with a logits correction step. The proposed method achieves up to a 21.4% improvement over the strongest baseline on a cross-dataset benchmark and the CIFAR-100-Corrupted dataset while reducing GFLOPs by 12.2% to 48.9%, with minimal hyperparameter dependency, on both the CLIP and SigLIP model series.
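The summary above describes token condensation only at a high level. As a rough illustration of the general idea (pruning less informative tokens at inference time based on their similarity to a small set of anchor tokens, and merging the pruned tokens so their information is not lost outright), here is a minimal PyTorch sketch. This is not the authors' TCA implementation: the function `condense_tokens`, the `anchors` reservoir, the `keep_ratio` parameter, and the mean-merge of dropped tokens are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def condense_tokens(tokens: torch.Tensor, anchors: torch.Tensor,
                    keep_ratio: float = 0.7) -> torch.Tensor:
    """Illustrative sketch: keep the patch tokens most similar to a set of
    anchor tokens and merge the rest into one summary token.

    tokens:  (B, N, D) ViT tokens; tokens[:, 0] is assumed to be the CLS token.
    anchors: (M, D) stand-in for a reservoir of domain anchor tokens.
    """
    cls_tok, patches = tokens[:, :1], tokens[:, 1:]               # (B, 1, D), (B, P, D)
    B, P, D = patches.shape

    # Cosine similarity of each patch token to its best-matching anchor.
    sim = F.normalize(patches, dim=-1) @ F.normalize(anchors, dim=-1).T   # (B, P, M)
    score = sim.max(dim=-1).values                                # (B, P)

    # Keep the top-k scoring patch tokens per image.
    k = max(1, int(keep_ratio * P))
    keep_idx = score.topk(k, dim=1).indices                       # (B, k)
    keep_mask = torch.zeros(B, P, dtype=torch.bool, device=patches.device)
    keep_mask.scatter_(1, keep_idx, True)

    kept = patches[keep_mask].view(B, k, D)                       # each row keeps exactly k tokens
    if k == P:                                                    # nothing to merge
        return torch.cat([cls_tok, kept], dim=1)

    # Average the pruned tokens into a single summary token.
    merged = patches[~keep_mask].view(B, P - k, D).mean(dim=1, keepdim=True)
    return torch.cat([cls_tok, kept, merged], dim=1)


# Toy usage: 2 images, 197 ViT-B/16 tokens (1 CLS + 196 patches), width 512.
tokens  = torch.randn(2, 197, 512)
anchors = torch.randn(8, 512)       # hypothetical anchor reservoir
out = condense_tokens(tokens, anchors, keep_ratio=0.7)
print(out.shape)                    # torch.Size([2, 139, 512])
```

Fewer tokens per forward pass is what drives the reported GFLOPs savings; the anchor-guided selection and merge step is the sketch's stand-in for the paper's information-preserving reduction.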

Low Difficulty Summary (written by GrooveSquid.com, original content)
Researchers are trying to make a powerful AI model called Contrastive Language-Image Pretraining (CLIP) work better on new tasks without any extra training. One way they do this is by changing how the model uses small pieces of information, called "tokens", during testing. This helps the model understand images and text better on new datasets. The team came up with a new method called Token Condensation as Adaptation (TCA) that improves the model without additional training. TCA works by adjusting which tokens the model keeps and by correcting its predictions. With this approach, the model can perform tasks up to 21.4% better than the best previous method while using less computing power.

Keywords

» Artificial intelligence  » Alignment  » Hyperparameter  » Inference  » Logits  » Pretraining  » Prompting  » Token  » Zero shot