Summary of Scalable DP-SGD: Shuffling vs. Poisson Subsampling, by Lynn Chua et al.
Scalable DP-SGD: Shuffling vs. Poisson Subsampling
by Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang
First submitted to arXiv on: 6 Nov 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Cryptography and Security (cs.CR); Data Structures and Algorithms (cs.DS)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract here
Medium | GrooveSquid.com (original content) | A novel analysis provides lower bounds on the privacy guarantees of the Adaptive Batch Linear Queries (ABLQ) mechanism with shuffled batch sampling, revealing substantial gaps compared to Poisson subsampling. This challenges the common practice of implementing Differentially Private Stochastic Gradient Descent (DP-SGD) with shuffling-based ABLQ while reporting privacy parameters as if Poisson subsampling were used. To mitigate this gap's impact on model utility, the paper introduces a practical approach for implementing Poisson subsampling at scale using massively parallel computation, enabling efficient training of models with the same level of privacy protection.
Low | GrooveSquid.com (original content) | This paper finds that a commonly used method in machine learning might not be as private as thought. The authors examined how well certain algorithms protect people's information and found that one popular approach, called shuffling-based ABLQ, is less private than previously believed. To fix this problem, they came up with a new way to do the same training with better privacy protection, which could help keep people's data safer.
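To make the distinction the summaries draw more concrete, here is a minimal Python sketch of the two batch samplers being compared. This is an illustrative toy, not the paper's implementation; the function names `poisson_batches` and `shuffled_batches` are hypothetical.

```python
import random

def poisson_batches(n, q, steps, rng):
    # Poisson subsampling: at each step, every example is included
    # independently with probability q, so batch sizes vary.
    return [[i for i in range(n) if rng.random() < q] for _ in range(steps)]

def shuffled_batches(n, batch_size, rng):
    # Shuffled batch sampling: shuffle the dataset once per epoch,
    # then partition it into fixed-size batches.
    order = list(range(n))
    rng.shuffle(order)
    return [order[i:i + batch_size] for i in range(0, n, batch_size)]
```

The key difference is that shuffling guarantees each example appears exactly once per epoch in a fixed-size batch, whereas Poisson subsampling makes each inclusion an independent coin flip; the paper's point is that privacy accounting derived for the latter does not transfer to the former.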
Keywords
» Artificial intelligence » Machine learning » Stochastic gradient descent