Summary of Knowledge Distillation-Based Model Extraction Attack Using GAN-based Private Counterfactual Explanations, by Fatima Ezzeddine et al.
Knowledge Distillation-Based Model Extraction Attack using GAN-based Private Counterfactual Explanations
by Fatima Ezzeddine, Omran Ayoub, Silvia Giordano
First submitted to arXiv on: 4 Apr 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Computers and Society (cs.CY)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The work investigates the exploitation of model explanations, particularly counterfactual explanations (CFs), to perform model extraction attacks (MEA) against machine learning as a service (MLaaS) platforms. The authors propose a novel approach based on Knowledge Distillation (KD) to improve the efficiency of extracting a substitute model by exploiting CFs, without knowledge of the training data distribution. They also devise an approach for training CF generators that incorporates differential privacy (DP) to generate private CFs. Experimental evaluations on real-world datasets show that the proposed KD-based MEA yields high-fidelity substitute models with fewer queries, and that DP-based CF generation helps mitigate MEA (a minimal code sketch of the extraction idea appears below the table). |
| Low | GrooveSquid.com (original content) | Machine learning as a service (MLaaS) has become popular, but it raises concerns about vulnerabilities introduced by ML model explanations. This study investigates how counterfactual explanations (CFs) can be used to perform model extraction attacks (MEA). The researchers propose a new way to extract a model using CFs and test it on real-world data. They also explore adding privacy protection to the explanations, which helps reduce the risk of MEA. |
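To make the KD-based extraction idea concrete, here is a minimal sketch in Python/PyTorch. It assumes a hypothetical `target_api` that, for a batch of query points, returns the target model's soft predictions and one counterfactual example per query; the substitute architecture, loss, and hyperparameters are illustrative assumptions, not the authors' exact method.

```python
# Minimal sketch of KD-based model extraction using counterfactual explanations.
# Assumption: `target_api(x)` is a hypothetical MLaaS endpoint returning
# (soft_labels, counterfactuals) for a batch of inputs x.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Substitute(nn.Module):
    """Small MLP standing in for the substitute (student) model."""

    def __init__(self, n_features: int, n_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.net(x)


def kd_extraction_step(substitute, optimizer, queries, target_api):
    """One extraction step: query the target, augment the batch with the
    returned counterfactuals, and distill the target's soft labels into
    the substitute via a KL-divergence (soft-label) loss."""
    soft_labels, counterfactuals = target_api(queries)
    cf_soft_labels, _ = target_api(counterfactuals)

    # CFs act as extra labeled points near the decision boundary.
    inputs = torch.cat([queries, counterfactuals], dim=0)
    teacher_probs = torch.cat([soft_labels, cf_soft_labels], dim=0)

    student_log_probs = F.log_softmax(substitute(inputs), dim=1)
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice the attacker would repeat `kd_extraction_step` over many query batches drawn from some surrogate distribution; the paper's point is that the counterfactuals returned by the explanation service make such distillation effective even without access to the original training data distribution, and that DP-trained CF generators blunt this advantage.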
Keywords
* Artificial intelligence
* Knowledge distillation
* Machine learning