
Summary of Evaluating and Safeguarding the Adversarial Robustness of Retrieval-Based In-Context Learning, by Simon Yu et al.


Evaluating and Safeguarding the Adversarial Robustness of Retrieval-Based In-Context Learning

by Simon Yu, Jie He, Pasquale Minervini, Jeff Z. Pan

First submitted to arXiv on: 24 May 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper investigates the robustness of In-Context Learning (ICL) methods against various types of adversarial attacks. The authors focus on retrieval-augmented ICL approaches, which leverage retrievers to extract semantically related examples as demonstrations. They show that these models can enhance robustness against test sample attacks, but are vulnerable to demonstration attacks. To address this issue, the authors propose an effective training-free adversarial defence method, DARD, which enriches the example pool with attacked samples. The results demonstrate a 15% reduction in Attack Success Rate (ASR) over baselines.
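To make the defence concrete, here is a minimal, self-contained sketch of the DARD idea described above: enrich the demonstration pool with attacked copies of its examples, so that a retriever can still surface relevant demonstrations when the test query itself has been perturbed. The character-swap "attack", the word-overlap retriever, and all function names are illustrative assumptions, not the paper's actual implementation.

```python
def attack(text):
    """Toy adversarial perturbation (illustrative only): swap the first two characters."""
    return text[1] + text[0] + text[2:] if len(text) > 1 else text

def similarity(a, b):
    """Toy retriever score: word-level Jaccard overlap between two strings."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def enrich_pool(pool):
    """DARD-style step: add an attacked variant of every (example, label) pair."""
    return pool + [(attack(x), y) for x, y in pool]

def retrieve(pool, query, k=2):
    """Return the k demonstrations most similar to the query."""
    return sorted(pool, key=lambda ex: similarity(ex[0], query), reverse=True)[:k]

# A tiny demonstration pool of (text, label) pairs.
pool = [("the movie was great fun", "positive"),
        ("the plot was dull and slow", "negative")]

defended = enrich_pool(pool)
# A perturbed test query still retrieves a matching attacked demonstration.
demos = retrieve(defended, "hte movie was great fun")
```

The point of the sketch is that after enrichment, a perturbed query can match an attacked demonstration exactly, so the retrieved in-context examples stay relevant; in the paper this is what reduces the Attack Success Rate without any retraining.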
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper looks at how well large language models do when faced with tricky questions or fake information. It shows that some ways of using these models are better than others at dealing with these challenges. The authors also introduce a new way to make the models more robust, which they call DARD. This method helps the models perform better and be less affected by bad information.

Keywords

» Artificial intelligence