Interpretability-Guided Test-Time Adversarial Defense
by Akshay Kulkarni, Tsui-Wei Weng
First submitted to arXiv on: 23 Sep 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Cryptography and Security (cs.CR); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary We propose a novel, low-cost test-time adversarial defense that uses interpretability-guided neuron importance ranking to identify the neurons most critical to each output class. This training-free approach improves the robustness-accuracy tradeoff while adding minimal computational overhead. Our method is among the most efficient test-time defenses, executing roughly 4x faster than comparable approaches, and it defends against a wide range of black-box, white-box, and adaptive attacks that compromised prior test-time defenses. We demonstrate its efficacy on CIFAR10, CIFAR100, and ImageNet-1k using the RobustBench benchmark, achieving average gains of 2.6%, 4.9%, and 2.8%, respectively. Under strong adaptive attacks, our method also outperforms state-of-the-art test-time defenses with an average gain of 1.5%. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This research proposes a new way to protect computer models from being fooled by maliciously altered inputs. The approach doesn’t require any special training and is very fast, about four times faster than previous methods! It’s also effective against many kinds of attacks, even the sneakiest ones. We tested this method on several datasets and found that it can improve performance by roughly 2.6-4.9% compared to other state-of-the-art defenses. |
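The core idea described above, ranking neurons by their importance to the predicted class and keeping only the most important ones at test time, can be sketched in a toy form. This is a minimal illustration, not the paper's actual algorithm: the function names are hypothetical, and the importance score here is a simple activation-times-weight contribution to the class logit, whereas the paper derives its ranking from interpretability attributions.

```python
import numpy as np

def neuron_importance(activations, class_weights):
    # Hypothetical importance score: each penultimate-layer neuron's
    # contribution to the predicted class logit (activation * weight).
    return activations * class_weights

def mask_low_importance(activations, class_weights, keep_ratio=0.5):
    # Keep only the top-k most important neurons; zero out the rest.
    # A test-time defense in this spirit would feed the masked
    # activations into the final classification layer.
    scores = neuron_importance(activations, class_weights)
    k = max(1, int(keep_ratio * len(scores)))
    top = np.argsort(scores)[-k:]          # indices of the k highest scores
    masked = np.zeros_like(activations)
    masked[top] = activations[top]
    return masked

# Toy example: 6 penultimate-layer activations and class weights.
acts = np.array([0.9, 0.1, 0.5, 0.0, 0.7, 0.2])
w = np.array([1.0, 0.2, 0.8, 0.5, 0.1, 0.9])
print(mask_low_importance(acts, w, keep_ratio=0.5))  # → [0.9 0.  0.5 0.  0.  0.2]
```

Because the masking is a cheap post-hoc operation on a frozen network, a defense of this shape needs no retraining, which is consistent with the "training-free" and low-overhead claims in the summary.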