
Summary of Do Counterfactual Examples Complicate Adversarial Training?, by Eric Yeats et al.


Do Counterfactual Examples Complicate Adversarial Training?

by Eric Yeats, Cameron Darwin, Eduardo Ortega, Frank Liu, Hai Li

First submitted to arXiv on: 16 Apr 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computer Vision and Pattern Recognition (cs.CV)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper’s original abstract, available on the arXiv listing.

Medium Difficulty Summary (original content by GrooveSquid.com)
This research paper uses diffusion models to study the relationship between robustness and performance in machine learning classifiers. The authors develop a simple method, built on a pre-trained diffusion model, to generate low-norm counterfactual examples (CEs): semantically altered versions of the data whose true class membership changes. They find that the confidence and accuracy of robust models on clean training data are associated with how close that data lies to its CEs. Moreover, robust models perform poorly when evaluated directly on the CEs, because they become invariant to the low-norm semantic changes the CEs introduce. The results indicate a significant overlap between non-robust and semantic features, challenging the common assumption that non-robust features are uninterpretable. (A minimal code sketch of this clean-versus-CE comparison appears after the summaries below.)

Low Difficulty Summary (original content by GrooveSquid.com)
This study looks at how well machine learning models handle tricky data. The researchers developed a new way to turn normal data into “counterfactual examples”: slightly changed versions of the data that have different correct answers. They found that strong (robust) models do poorly when tested on these changed examples, because those models are trained to ignore small changes in the data. The results show that the small, meaningful changes behind these examples overlap with the “non-robust” features such models learn to ignore, which challenges the common belief that non-robust features have no human-understandable meaning.
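
To make the comparison described in the medium difficulty summary concrete, here is a minimal sketch (not the authors’ code) of how one might measure a classifier’s behavior on clean data versus paired counterfactual examples. It assumes the CEs have already been generated (the paper’s diffusion-based method is not reproduced here), and the names `model`, `x`, `y`, `x_ce`, and `y_ce` are placeholders for a classifier, clean inputs with their labels, and the counterfactual inputs with the different labels implied by their semantic changes.

```python
import torch
import torch.nn.functional as F

def evaluate_clean_vs_ce(model, x, y, x_ce, y_ce):
    """Compare a classifier's behavior on clean inputs and on their paired CEs.

    x, y       -- clean inputs and their true labels
    x_ce, y_ce -- counterfactual versions of x and the (different) true labels
                  implied by the low-norm semantic change
    """
    model.eval()
    with torch.no_grad():
        # Per-example L2 distance between each input and its counterfactual
        # ("proximity of the data to their CEs").
        ce_distance = (x_ce - x).flatten(start_dim=1).norm(dim=1)

        # Predictions and softmax confidences on clean data and on the CEs.
        probs_clean = F.softmax(model(x), dim=1)
        probs_ce = F.softmax(model(x_ce), dim=1)

        clean_acc = (probs_clean.argmax(dim=1) == y).float().mean()
        ce_acc = (probs_ce.argmax(dim=1) == y_ce).float().mean()
        clean_conf = probs_clean.max(dim=1).values.mean()
        ce_conf = probs_ce.max(dim=1).values.mean()

    return {
        "mean_ce_distance": ce_distance.mean().item(),
        "clean_accuracy": clean_acc.item(),
        "ce_accuracy": ce_acc.item(),  # the paper reports this is low for robust models
        "clean_confidence": clean_conf.item(),
        "ce_confidence": ce_conf.item(),
    }
```

Under these assumptions, running the function over a dataset would let one relate each example’s distance to its CE with the model’s clean confidence and accuracy, and contrast clean accuracy with accuracy on the CEs themselves, mirroring the kind of evaluation the summary describes.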

Keywords

» Artificial intelligence  » Diffusion  » Machine learning  » Semantics