Clip Body and Tail Separately: High Probability Guarantees for DPSGD with Heavy Tails

by Haichao Sha, Yang Cao, Yong Liu, Yuncheng Wu, Ruixuan Liu, Hong Chen

First submitted to arXiv on: 27 May 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Cryptography and Security (cs.CR)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (original content by GrooveSquid.com)
A novel differentially private stochastic gradient descent (DPSGD) method, Discriminative Clipping (DC)-DPSGD, is proposed to address the heavy-tail phenomenon in deep learning gradients. Existing DPSGD works assume sub-Gaussian gradient distributions and design clipping mechanisms accordingly to optimize training performance. Recent studies, however, show that deep learning gradients can exhibit infinite variance, leading to excessive clipping loss. DC-DPSGD introduces a subspace identification technique to distinguish body gradients from tail gradients, and a discriminative clipping mechanism that applies a different threshold to each. This approach reduces the empirical gradient norm under non-convex conditions and outperforms baselines by up to 9.72% in accuracy on four real-world datasets.

Low Difficulty Summary (original content by GrooveSquid.com)
DPSGD is a way to keep training data private in deep learning. Current methods assume gradients follow certain patterns, but recent research shows those assumptions don't always hold. This paper proposes a new approach, DC-DPSGD, that handles the unusual patterns found in real-world gradients by identifying different types of gradients and applying different clipping rules to each. The results show that this method keeps data private while getting better results than other methods.

Keywords

* Artificial intelligence
* Deep learning
* Stochastic gradient descent