Calibrating Practical Privacy Risks for Differentially Private Machine Learning

by Yuechun Gu, Keke Chen

First submitted to arXiv on: 30 Oct 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Cryptography and Security (cs.CR)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Differentially private machine learning research often relies on the privacy budget epsilon (ε) to quantify privacy. However, recent studies show that different models and datasets can exhibit very different attack success rates (ASR) under likelihood-ratio-based membership inference attacks, despite sharing the same theoretical ε setting. This highlights the need for a more practical approach to evaluating real-world privacy risks. In this paper, we investigate methods to lower ASR values without compromising data utility in model training. Our findings demonstrate that selectively suppressing privacy-sensitive features can lower ASR values, allowing the theoretical ε setting to be relaxed while maintaining equivalent practical privacy protection. We use SHAP and LIME model explainers to evaluate feature sensitivities and develop feature-masking strategies. Experimental results show a strong link between ASR and a dataset's practical privacy risk, enabling more flexible privacy budget settings.
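The feature-masking idea can be sketched roughly as follows: use an explainer's attribution scores as per-feature sensitivities, then suppress the most sensitive features before training. The snippet below is a minimal illustration with the shap package and a scikit-learn logistic regression; the dataset, the mean-imputation masking, and the number of masked features are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Illustrative sketch (not the paper's exact pipeline): treat mean |SHAP value|
# as a per-feature sensitivity score and mask the most sensitive features.
X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)

model = LogisticRegression(max_iter=1000).fit(X, y)

explainer = shap.LinearExplainer(model, X)
abs_vals = np.abs(np.asarray(explainer.shap_values(X)))
# Reduce to one score per feature (handles both (samples, features) and
# (samples, features, classes) output shapes across shap versions).
sensitivity = abs_vals.reshape(abs_vals.shape[0], abs_vals.shape[1], -1).mean(axis=(0, 2))

# Mask the k most sensitive features by replacing them with their column mean
# (k = 5 is an arbitrary choice for this sketch).
k = 5
most_sensitive = np.argsort(sensitivity)[-k:]
X_masked = X.copy()
X_masked[:, most_sensitive] = X[:, most_sensitive].mean(axis=0)

# A differentially private learner (e.g., DP-SGD) would then be trained on
# X_masked instead of X, potentially with a relaxed privacy budget.
```
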
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about making sure our personal information stays private when machine learning models are trained on it. Right now, researchers use a number called “epsilon” to measure privacy, but it’s not always clear what that number means in real life. The researchers found that attackers have very different levels of success across models and datasets when trying to figure out whether a particular person’s data was used to train a model (this is called membership inference), even when epsilon is the same. They also found that hiding the features that reveal the most private information makes it much harder for an attacker to make that guess. This means the privacy settings can be relaxed somewhat without actually sacrificing privacy. The researchers used explanation tools to figure out which features were most sensitive and developed ways to hide them. Their results show that this method works well and could help us better judge how private a model really is.
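To make the "attack success rate" idea concrete, the toy sketch below estimates how often a very simple membership inference attack (a loss-threshold rule, not the likelihood-ratio attack studied in the paper) correctly guesses whether an example was in the training set. The dataset, model, and threshold choice are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy sketch: estimate an attack success rate (ASR) with a loss-threshold
# membership inference attack. Members are training examples; non-members
# are held-out examples.
X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_in, y_in)

def per_example_loss(model, X, y):
    # Cross-entropy loss of the true label for each example.
    probs = model.predict_proba(X)
    return -np.log(np.clip(probs[np.arange(len(y)), y], 1e-12, None))

loss_in = per_example_loss(model, X_in, y_in)    # members (seen in training)
loss_out = per_example_loss(model, X_out, y_out) # non-members (held out)

# Attack rule: predict "member" when the loss is below a threshold.
threshold = np.median(np.concatenate([loss_in, loss_out]))
guesses = np.concatenate([loss_in, loss_out]) < threshold
truth = np.concatenate([np.ones(len(loss_in)), np.zeros(len(loss_out))]).astype(bool)
asr = (guesses == truth).mean()
print(f"Estimated attack success rate: {asr:.3f}")
```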

Keywords

  • Artificial intelligence
  • Inference
  • Likelihood
  • Machine learning