Clip Body and Tail Separately: High Probability Guarantees for DPSGD with Heavy Tails
by Haichao Sha, Yang Cao, Yong Liu, Yuncheng Wu, Ruixuan Liu, Hong Chen
First submitted to arXiv on: 27 May 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Cryptography and Security (cs.CR)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | A novel differentially private stochastic gradient descent (DPSGD) method, Discriminative Clipping (DC)-DPSGD, is proposed to address the heavy-tail phenomenon in deep learning gradients. Existing DPSGD works assume sub-Gaussian gradient distributions and design clipping mechanisms to optimize training performance. However, recent studies show that deep learning gradients can exhibit infinite variance, leading to excessive clipping loss. DC-DPSGD introduces a subspace identification technique to distinguish body gradients from tail gradients, and a discriminative clipping mechanism that applies a different threshold to each group (a minimal sketch of this two-threshold clipping step appears after this table). This approach reduces the empirical gradient norm under non-convex conditions and outperforms baselines by up to 9.72% in accuracy on four real-world datasets. |
Low | GrooveSquid.com (original content) | DPSGD is a way to keep training data private in deep learning. Current methods assume gradients follow certain patterns, but recent research shows that these assumptions don't always hold. This paper proposes a new approach called DC-DPSGD that can handle the unusual gradient patterns found in real-world data. It does this by identifying different types of gradients and applying different rules to each one. Experiments show that this method keeps data private while achieving better accuracy than existing approaches. |
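To make the "different thresholds for body and tail gradients" idea concrete, here is a minimal sketch of one DPSGD step with two clipping thresholds. It uses a simple norm-based split in place of the paper's subspace identification, and the names `clip_body`, `clip_tail`, and `sigma` are illustrative assumptions, not the authors' implementation; the paper's actual noise calibration and privacy accounting differ.

```python
import numpy as np

def dual_threshold_dpsgd_step(per_sample_grads, clip_body, clip_tail, sigma, rng):
    """One noisy gradient step with separate clipping thresholds for body and tail.

    per_sample_grads: array of shape (batch_size, dim), one gradient per example.
    clip_body, clip_tail: clipping thresholds with clip_body <= clip_tail.
    sigma: noise multiplier, calibrated here to the larger threshold.
    """
    norms = np.linalg.norm(per_sample_grads, axis=1)
    # Crude proxy for the paper's subspace identification: treat unusually
    # large gradients as "tail" samples and the rest as "body" samples.
    is_tail = norms > clip_body
    thresholds = np.where(is_tail, clip_tail, clip_body)
    # Standard per-sample clipping: rescale each gradient to its own threshold.
    scale = np.minimum(1.0, thresholds / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale[:, None]
    # Sum, add Gaussian noise scaled to the largest per-sample norm, then average.
    noise = rng.normal(0.0, sigma * clip_tail, size=per_sample_grads.shape[1])
    return (clipped.sum(axis=0) + noise) / len(per_sample_grads)

# Example usage with heavy-tailed synthetic per-sample gradients.
rng = np.random.default_rng(0)
grads = rng.standard_t(df=2, size=(32, 10))
update = dual_threshold_dpsgd_step(grads, clip_body=1.0, clip_tail=5.0, sigma=1.0, rng=rng)
```

The point of the two thresholds is that body gradients are clipped tightly (small bias, small noise contribution) while rare tail gradients are allowed a larger norm so their signal is not destroyed by aggressive clipping.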
Keywords
* Artificial intelligence
* Deep learning
* Stochastic gradient descent