Summary of Improving Membership Inference in ASR Model Auditing with Perturbed Loss Features, by Francisco Teixeira et al.
Improving Membership Inference in ASR Model Auditing with Perturbed Loss Features
by Francisco Teixeira, Karla Pizzi, Raphael Olivier, Alberto Abad, Bhiksha Raj, Isabel Trancoso
First submitted to arXiv on: 2 May 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Cryptography and Security (cs.CR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary: Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary: The proposed approach combines loss-based features with Gaussian and adversarial perturbations to perform Membership Inference (MI) on Automatic Speech Recognition (ASR) models. According to the authors, this combination has not been explored before, and they compare it against commonly used error-based features. The proposed features significantly improve sample-level MI performance, while for speaker-level MI the improvement over error-based features is smaller but still notable. The study highlights the importance of considering different feature sets and levels of access to the target model for effective MI in ASR systems, which is useful when auditing such models. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary: ASR systems can be privacy threats if people's data is used to train them without permission. This paper explores a new way to check whether a system was trained on someone's data. The authors combine features that measure loss (roughly, how wrong the model is) with added noise or distortions. They compare this method to another common approach and find that it works much better for checking individual samples, with a smaller gain when checking for entire speakers. The study shows why auditors need to consider different kinds of features and different levels of access to the model when checking whether ASR systems keep user data private. |
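
To make the idea of "perturbed loss features" concrete, here is a minimal sketch under strong simplifying assumptions. It is not the authors' implementation: a toy linear model with squared-error loss stands in for the ASR model and its training loss, the perturbations are assumed to be applied to the model input, and the adversarial perturbation is a single signed-gradient (FGSM-style) step, which may differ from the attack used in the paper. All names and parameter values (`toy_loss`, `perturbed_loss_features`, `sigmas`, `eps`, `n_draws`) are hypothetical.

```python
import numpy as np


def toy_loss(w, x, y):
    """Squared error of a linear model (assumed stand-in for an ASR loss)."""
    return float((w @ x - y) ** 2)


def toy_loss_grad_x(w, x, y):
    """Analytic gradient of toy_loss with respect to the input x."""
    return 2.0 * (w @ x - y) * w


def perturbed_loss_features(w, x, y, sigmas=(0.01, 0.1), eps=0.05,
                            n_draws=8, seed=0):
    """Return [clean loss, mean loss per Gaussian sigma..., adversarial loss]."""
    rng = np.random.default_rng(seed)
    feats = [toy_loss(w, x, y)]

    # Gaussian perturbations: average the loss over several noisy copies of x.
    for sigma in sigmas:
        losses = [
            toy_loss(w, x + rng.normal(0.0, sigma, size=x.shape), y)
            for _ in range(n_draws)
        ]
        feats.append(float(np.mean(losses)))

    # Adversarial perturbation: one signed-gradient step that increases the loss.
    x_adv = x + eps * np.sign(toy_loss_grad_x(w, x, y))
    feats.append(toy_loss(w, x_adv, y))

    return np.array(feats)


# Hypothetical example: one "utterance" x with target y under toy weights w.
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 0.0, 0.5])
y = 1.0
features = perturbed_loss_features(w, x, y)
```

In an auditing setup, a feature vector like `features` would be computed for each sample and passed to a separate membership classifier. The intuition, stated here as an assumption rather than a result from the paper, is that the loss on training samples may react differently to perturbation than the loss on unseen samples.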
Keywords
» Artificial intelligence » Inference