
Summary of Stop Reasoning! When Multimodal LLM with Chain-of-Thought Reasoning Meets Adversarial Image, by Zefeng Wang et al.


Stop Reasoning! When Multimodal LLM with Chain-of-Thought Reasoning Meets Adversarial Image

by Zefeng Wang, Zhen Han, Shuo Chen, Fan Xue, Zifeng Ding, Xun Xiao, Volker Tresp, Philip Torr, Jindong Gu

First submitted to arxiv on: 22 Feb 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Multimodal Large Language Models (MLLMs) have gained attention for their text and image understanding capabilities. Chain-of-Thought (CoT) reasoning, which produces intermediate reasoning steps, has been explored to improve MLLMs’ explainability. However, recent studies show that MLLMs remain vulnerable to adversarial images, which raises the question of how CoT affects MLLMs’ robustness against such attacks. Our study generalizes existing attacks to CoT-based inference by targeting the rationale and answer components of the reasoning chain. We find that CoT improves adversarial robustness, but only marginally. To circumvent this modest protection, we propose a novel attack, the stop-reasoning attack, which attacks the model while bypassing its CoT reasoning. Experiments on three MLLMs and two visual reasoning datasets demonstrate the effectiveness of the proposed method, which outperforms baseline attacks by a significant margin.
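
As a rough illustration only, the sketch below shows a generic PGD-style image perturbation that maximizes the loss on the model's final answer, which is the kind of gradient-based attack the summary describes generalizing to CoT inference. It is not the authors' implementation: `mllm_answer_loss` is a hypothetical, differentiable helper, and the eps/alpha/steps values are placeholder choices. The paper's stop-reasoning attack would instead craft the perturbation so the model skips its rationale, which would correspond to a different loss target in the same loop.

```python
import torch

def pgd_attack(image, question, answer, mllm_answer_loss,
               eps=8 / 255, alpha=1 / 255, steps=40):
    """Maximize the answer loss within an L-infinity ball of radius eps.

    `mllm_answer_loss(adv, question, answer)` is assumed to return the
    cross-entropy of the MLLM's answer tokens given the (differentiable)
    image input, the question prompt, and the ground-truth answer.
    """
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = mllm_answer_loss(adv, question, answer)  # higher = more wrong
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv + alpha * grad.sign()                # gradient ascent step
            adv = image + (adv - image).clamp(-eps, eps)   # project into eps-ball
            adv = adv.clamp(0, 1)                          # keep pixel values valid
    return adv.detach()
```
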
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about how good multimodal large language models are at understanding text and images together. It also talks about how these models can explain their thinking step by step. But what happens when these models are tricked with fake images? Researchers found that the models still get fooled! They wanted to know whether this “explanation” process helps or hurts a model’s ability to resist such tricks. Surprisingly, it doesn’t make a big difference. They then came up with a new way to trick the models that skips the explanation step entirely. This new method is really good at fooling the models and making them give wrong answers.

Keywords

* Artificial intelligence
* Attention