Summary of Test-Time Backdoor Attacks on Multimodal Large Language Models, by Dong Lu et al.
Test-Time Backdoor Attacks on Multimodal Large Language Models
by Dong Lu, Tianyu Pang, Chao Du, Qian Liu, Xianjun Yang, Min Lin
First submitted to arXiv on 13 Feb 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper introduces AnyDoor, a novel test-time backdoor attack against multimodal large language models (MLLMs). AnyDoor injects the backdoor into the textual modality via adversarial test images that share the same universal perturbation, without requiring any access to or modification of the training data. Crucially, it decouples the timing of setting up the attack from the timing of activating the harmful effect (a brief code sketch of this activation step follows the table). The authors validate AnyDoor against popular MLLMs such as LLaVA-1.5, MiniGPT-4, InstructBLIP, and BLIP-2, and support the results with ablation studies. |
Low | GrooveSquid.com (original content) | AnyDoor is a new way to hack into big language models that can understand both text and images. The bad guys can make the model do something harmful by adding a special trigger to the test images, so they don't need to change anything about how the model was trained. The researchers tested AnyDoor against some popular models and found that it worked well. They also showed that this attack is hard to defend against because the trigger prompt and harmful effect can be changed on the fly. |
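To make the attack mechanism described above more concrete, here is a minimal, hypothetical Python sketch of the test-time "activation" step: a single pre-computed universal perturbation is added to any benign test image so that the image carries the backdoor trigger. The perturbation optimization ("setup" phase) and the call to the victim MLLM are not shown, and all file names and the `apply_trigger` helper are illustrative assumptions, not code from the paper.

```python
# Illustrative sketch (not the authors' implementation): applying a
# pre-computed universal perturbation to arbitrary test images, i.e.
# the "activation" half of a test-time backdoor attack.  `delta` is
# assumed to have been optimized beforehand so that any image carrying
# it steers the victim MLLM toward an attacker-chosen response.
import numpy as np
from PIL import Image


def apply_trigger(image_path: str, delta: np.ndarray) -> Image.Image:
    """Add the universal perturbation `delta` (H x W x 3, float values on
    the 0-255 pixel scale, bounded by some epsilon) to a benign test
    image, then clip back to the valid pixel range."""
    h, w = delta.shape[:2]
    img = Image.open(image_path).convert("RGB").resize((w, h))
    perturbed = np.asarray(img, dtype=np.float32) + delta
    perturbed = np.clip(perturbed, 0, 255).astype(np.uint8)
    return Image.fromarray(perturbed)


if __name__ == "__main__":
    # Hypothetical file names; the perturbation would come from the
    # attacker's earlier optimization ("setup") phase, not shown here.
    delta = np.load("universal_perturbation.npy").astype(np.float32)
    adv_image = apply_trigger("benign_photo.jpg", delta)
    adv_image.save("triggered_photo.png")
    # Feeding `adv_image` to the victim MLLM with an ordinary prompt
    # would then elicit the attacker-chosen response, while the same
    # prompt on the clean image behaves normally.
```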
Keywords
* Artificial intelligence
* Prompt