Summary of One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models, by Lin Li et al.
One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models
by Lin Li, Haoyan Guan, Jianing Qiu, Michael Spratling
First submitted to arXiv on: 4 Mar 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper but are written at different levels of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | This paper investigates the adversarial robustness of pre-trained Vision-Language Models (VLMs) from a novel perspective: the text prompt. The authors show that both attack and defense strength are sensitive to the text prompt used, and they propose Adversarial Prompt Tuning (APT), which learns robust prompts to improve resilience against adversarial attacks. APT is effective and efficient, outperforming hand-engineered prompts and other state-of-the-art adaptation methods across 15 datasets and various data-sparsity schemes. Remarkably, adding just one learned word to the prompt significantly boosts accuracy and robustness; a minimal code sketch of the idea follows this table. |
| Low | GrooveSquid.com (original content) | Large pre-trained Vision-Language Models (VLMs) are great at understanding images and text, but they’re also really easy to trick. This paper tries to fix that problem by looking at text prompts. The researchers show that changing these prompts can make VLMs more or less resistant to being tricked, and they come up with a way to learn better prompts, which they call Adversarial Prompt Tuning (APT). It works really well and is faster than other methods: by adding just one learned word to the prompt, APT makes VLMs more accurate and harder to trick. |
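
To make the medium-difficulty description concrete, here is a minimal, self-contained sketch of adversarial prompt tuning in the spirit of APT. It is not the authors’ implementation: the toy image and text encoders stand in for CLIP’s frozen towers, and every name and hyperparameter (`ToyImageEncoder`, `pgd_attack`, `N_CTX`, `eps`, and so on) is an illustrative assumption. What the sketch does show is the core bi-level pattern: an inner PGD loop crafts adversarial images against the current prompt, and an outer step updates only the learned context embedding, the “one prompt word”, while the backbone stays frozen.

```python
# Minimal sketch of adversarial prompt tuning (in the spirit of APT).
# Toy encoders stand in for CLIP's image/text towers; all names and
# hyperparameters are illustrative, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMBED_DIM, NUM_CLASSES, N_CTX = 64, 10, 1  # N_CTX=1: a single learned "word"

class ToyImageEncoder(nn.Module):
    """Stands in for CLIP's (frozen) image tower."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, EMBED_DIM))
    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

class ToyTextEncoder(nn.Module):
    """Stands in for CLIP's (frozen) text tower: token embeddings -> one vector."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(EMBED_DIM, EMBED_DIM)
    def forward(self, token_embeds):           # (num_classes, seq_len, dim)
        return F.normalize(self.proj(token_embeds.mean(dim=1)), dim=-1)

image_enc, text_enc = ToyImageEncoder(), ToyTextEncoder()
for p in list(image_enc.parameters()) + list(text_enc.parameters()):
    p.requires_grad_(False)                    # the backbone stays frozen

class_embeds = torch.randn(NUM_CLASSES, 1, EMBED_DIM)      # frozen class-name tokens
ctx = nn.Parameter(torch.randn(N_CTX, EMBED_DIM) * 0.02)   # the learned prompt word

def logits_fn(images):
    # Prepend the learnable context to every class's token embeddings.
    prompts = torch.cat([ctx.expand(NUM_CLASSES, -1, -1), class_embeds], dim=1)
    return image_enc(images) @ text_enc(prompts).t() * 100.0  # CLIP-style cosine logits

def pgd_attack(images, labels, eps=8 / 255, alpha=2 / 255, steps=3):
    """Inner maximization: perturb images to maximize the classification loss."""
    adv = images + torch.empty_like(images).uniform_(-eps, eps)
    for _ in range(steps):
        adv = adv.detach().requires_grad_(True)
        loss = F.cross_entropy(logits_fn(adv), labels)
        grad = torch.autograd.grad(loss, adv)[0]
        adv = images + (adv + alpha * grad.sign() - images).clamp(-eps, eps)
    return adv.detach().clamp(0, 1)

# Outer minimization: tune only the prompt context on adversarial images.
opt = torch.optim.SGD([ctx], lr=0.01)
images = torch.rand(8, 3, 32, 32)              # dummy batch for illustration
labels = torch.randint(0, NUM_CLASSES, (8,))
for _ in range(5):
    adv = pgd_attack(images, labels)
    loss = F.cross_entropy(logits_fn(adv), labels)
    opt.zero_grad()
    loss.backward()                            # gradients reach only `ctx`
    opt.step()
print("robust loss:", loss.item())
```

Note the design choice this sketch illustrates: because only `ctx` is optimized, the approach is highly parameter-efficient, and the same forward pass serves both the attack (gradients with respect to the image) and the tuning step (gradients with respect to the prompt).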
Keywords
- Artificial intelligence
- Prompt