Summary of Expose Before You Defend: Unifying and Enhancing Backdoor Defenses via Exposed Models, by Yige Li et al.
Expose Before You Defend: Unifying and Enhancing Backdoor Defenses via Exposed Models
by Yige Li, Hanxun Huang, Jiaming Zhang, Xingjun Ma, Yu-Gang Jiang
First submitted to arXiv on: 25 Oct 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract on arXiv. |
| Medium | GrooveSquid.com (original content) | This paper presents Expose Before You Defend (EBYD), a novel two-step defense framework against backdoor attacks on deep neural networks. Backdoor attacks poison training data with pre-designed triggers, a risk that grows as large models are trained on web-crawled datasets. EBYD first exposes a model’s backdoor functionality through a preprocessing step called backdoor exposure, then detects and removes the backdoor features. For the exposure step, the authors propose Clean Unlearning (CUL), a novel technique that unlearns the clean features from the backdoored model, and also study model editing/modification techniques such as fine-tuning, sparsification, and weight perturbation. Extensive experiments on 10 image attacks and 6 text attacks across various datasets demonstrate the importance of backdoor exposure for defense, showing benefits in downstream tasks such as backdoor label detection, trigger recovery, backdoored-model detection, and backdoor removal. (A minimal sketch of the clean-unlearning step appears below the table.) |
| Low | GrooveSquid.com (original content) | This paper is about a new way to protect computers from being tricked by hidden “backdoors” in artificial intelligence (AI) models. Backdoors are sneaky attacks that let someone control an AI model without anyone noticing. The authors created a special tool called Expose Before You Defend, or EBYD, to find and fix these backdoors before they cause harm. They tested their tool on many different types of image and text data and showed that it can help detect and remove backdoors from AI models. This is important because as AI becomes more powerful, the risk of backdoors being used to manipulate or deceive us grows. |
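To make the exposure step concrete, here is a minimal sketch of the clean-unlearning idea described in the medium summary: briefly train the backdoored model to *maximize* its loss on a small clean dataset, so clean features are erased while the backdoor behavior survives. This is an illustrative assumption about how CUL could be implemented, not the authors’ released code; the function name `clean_unlearn` and all hyperparameters are hypothetical.

```python
# Hypothetical sketch of clean unlearning: ascend on the clean-data loss so the
# model "forgets" clean features while the backdoor behavior persists.
# Names and hyperparameters are illustrative, not from the paper's code.
import torch
import torch.nn.functional as F

def clean_unlearn(model, clean_loader, epochs=5, lr=1e-3, device="cpu"):
    """Gradient-ascent unlearning on a small clean set (illustrative only)."""
    model = model.to(device)
    model.train()
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in clean_loader:
            x, y = x.to(device), y.to(device)
            loss = F.cross_entropy(model(x), y)
            opt.zero_grad()
            (-loss).backward()  # negated loss: maximize clean error to erase clean features
            opt.step()
    return model  # the "exposed" model, now dominated by backdoor features
```

Intuitively, once the clean features are unlearned, the exposed model’s remaining behavior is dominated by the backdoor, which is what makes the downstream steps the paper evaluates (backdoor label detection, trigger recovery, model detection, and removal) easier.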
Keywords
» Artificial intelligence » Fine-tuning