
Summary of Better Representations via Adversarial Training in Pre-Training: A Theoretical Perspective, by Yue Xing et al.


Better Representations via Adversarial Training in Pre-Training: A Theoretical Perspective

by Yue Xing, Xiaofeng Lin, Qifan Song, Yi Xu, Belinda Zeng, Guang Cheng

First submitted to arXiv on: 26 Jan 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper investigates the relationship between pre-trained models and their downstream applications in deep learning. Researchers have observed that when a model is adversarially pre-trained on a large dataset, downstream tasks built on it can inherit adversarial robustness from the pre-training phase. The authors provide a theoretical explanation for this phenomenon, showing that feature purification plays a crucial role: with adversarial training, each hidden node in the pre-trained network tends to pick up only one or a few features, and this cleaner representation makes downstream tasks more resilient to attacks. The analysis covers both supervised pre-training and contrastive learning, and it suggests that, on top of such a pre-trained model, clean training alone can be sufficient to achieve adversarial robustness in downstream tasks. A minimal sketch of this two-stage pipeline is given below.
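The setting described above can be pictured as two stages: adversarial training during pre-training, then plain clean training on the downstream task. The following is a minimal, hypothetical PyTorch sketch of that setting; the toy data, network sizes, one-step FGSM attack, epsilon, and optimizer settings are illustrative assumptions and are not taken from the paper, nor is this the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-ins for a large pre-training dataset (20-dim inputs, 4 classes).
x_pre = torch.randn(512, 20)
y_pre = torch.randint(0, 4, (512,))

# Small encoder; its hidden nodes play the role of the "features" in the summary.
encoder = nn.Sequential(nn.Linear(20, 64), nn.ReLU())
pre_head = nn.Linear(64, 4)
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(pre_head.parameters()), lr=1e-2
)

eps = 0.1  # perturbation budget (illustrative value, not from the paper)

# Stage 1: supervised pre-training on adversarial examples (one-step FGSM attack).
for _ in range(200):
    x_adv = x_pre.clone().requires_grad_(True)
    attack_loss = F.cross_entropy(pre_head(encoder(x_adv)), y_pre)
    grad, = torch.autograd.grad(attack_loss, x_adv)
    x_adv = (x_pre + eps * grad.sign()).detach()  # worst-case perturbed inputs

    opt.zero_grad()
    F.cross_entropy(pre_head(encoder(x_adv)), y_pre).backward()
    opt.step()

# Stage 2: clean (standard) training of a downstream head on the frozen encoder.
x_down = torch.randn(256, 20)
y_down = torch.randint(0, 2, (256,))

for p in encoder.parameters():
    p.requires_grad_(False)

down_head = nn.Linear(64, 2)
down_opt = torch.optim.Adam(down_head.parameters(), lr=1e-2)

for _ in range(200):
    down_opt.zero_grad()
    F.cross_entropy(down_head(encoder(x_down)), y_down).backward()
    down_opt.step()
```

The structural point of the sketch is that only the first stage sees perturbed inputs; the downstream head is trained on clean data with the encoder frozen, mirroring the claim that clean downstream training can suffice once pre-training is adversarial.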
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper looks at how big models learn things. The authors found that when a model is first trained on lots of data in a special, tougher way, it learns to stand up to tricky fake examples. This is helpful because later tasks built on that model don’t need much extra work to stay good at fighting fake stuff. Looking deeper, the authors saw that the way the model works inside explains why: with this special training, each part of the model becomes really good at focusing on just one thing, which makes it harder for fake examples to trick it.

Keywords

  • Artificial intelligence
  • Deep learning
  • Supervised