Summary of Can Overfitted Deep Neural Networks in Adversarial Training Generalize? — An Approximation Viewpoint, by Zhongjie Shi et al.
Can overfitted deep neural networks in adversarial training generalize? – An approximation viewpoint
by Zhongjie Shi, Fanghui Liu, Yuan Cao, Johan A.K. Suykens
First submitted to arXiv on: 24 Jan 2024
Categories
- Main: Machine Learning (stat.ML)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract, available on the arXiv page. |
| Medium | GrooveSquid.com (original content) | This paper investigates the relationship between adversarial training and overfitting in deep neural networks (DNNs). Adversarial training is a widely used method for improving the robustness of DNNs against input perturbations, yet adversarial training on over-parameterized networks empirically suffers from robust overfitting: it achieves low adversarial training error but poor robust generalization. The authors give a theoretical account of this phenomenon by analyzing the approximation properties of overfitted DNNs in adversarial training. Specifically, they show that overfitted DNNs can achieve arbitrarily small adversarial training error while still attaining small robust generalization error, under certain conditions on data quality and perturbation level. The results also demonstrate the importance of model capacity for robust generalization, highlighting a trade-off between robustness and complexity in DNN design. These findings shed light on the mathematical foundations of robustness in DNNs from an approximation viewpoint. (A minimal code sketch of the adversarial training procedure appears after this table.) |
| Low | GrooveSquid.com (original content) | This research looks at what happens when deep neural networks are trained to resist small, deliberate changes to their inputs. A network can learn its training examples almost perfectly, but that does not automatically mean it will make good predictions on new, unseen data. The authors want to understand when such heavily fitted networks still generalize well. They show that very large networks can fit their training data perfectly and, if the data is good enough and the perturbations are not too large, still handle new situations robustly. At the same time, simply making a network bigger is not enough on its own: there is a balance to strike between how robust the network should be and how complex it is allowed to become. |
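The procedure analyzed in the paper is standard adversarial training: an inner maximization that searches for a worst-case perturbation of each training input, wrapped in an outer minimization of the loss on those perturbed inputs. Below is a minimal sketch of that loop using PGD-style perturbations; the two-layer model, the synthetic data, and all hyperparameters (eps, alpha, step counts, learning rate) are illustrative assumptions for this sketch, not details taken from the paper.

```python
# Minimal sketch of adversarial training with PGD-style perturbations.
# Everything below (model, data, hyperparameters) is a hypothetical setup
# chosen so the script runs end to end; it is not the paper's construction.
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=0.1, alpha=0.02, steps=10):
    """Inner maximization: find a perturbation of x within an
    L-infinity ball of radius eps that increases the loss."""
    x_adv = x.detach() + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the eps-ball around x.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)
    return x_adv.detach()

# Toy over-parameterized two-layer ReLU network on synthetic 2-class data.
torch.manual_seed(0)
x = torch.randn(256, 20)
y = (x[:, 0] > 0).long()
model = nn.Sequential(nn.Linear(20, 512), nn.ReLU(), nn.Linear(512, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.05)

for epoch in range(200):
    x_adv = pgd_attack(model, x, y)          # inner maximization
    loss = F.cross_entropy(model(x_adv), y)  # outer minimization target
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final adversarial training loss: {loss.item():.4f}")
```

In the paper's terms, running the outer loop on an over-parameterized model until the adversarial training loss is near zero is the overfitting regime; the theoretical question the paper answers is under what conditions on data quality and perturbation level such a model still has small robust generalization error.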
Keywords
* Artificial intelligence
* Generalization
* Overfitting