Tracing Privacy Leakage of Language Models to Training Data via Adjusted Influence Functions

by Jinxin Liu, Zao Yang

First submitted to arXiv on: 20 Aug 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL); Cryptography and Security (cs.CR)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper, each written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper’s original abstract. Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper addresses privacy leakage in Large Language Models (LLMs) by using Influence Functions (IFs) to trace generated sensitive information back to the training data responsible for it. Current IFs struggle to estimate the influence of tokens with large gradient norms, which leads to inaccurate tracing results. To overcome this limitation, the authors propose the Heuristically Adjusted IF (HAIF), which reduces the weight of such tokens. Because no ground truth existed for tracing privacy leakage, the authors construct two datasets, PII-E and PII-CR; on these, HAIF improves tracing accuracy by 20.96% to 73.71% and 3.21% to 45.93%, respectively, over state-of-the-art IFs. HAIF also outperforms state-of-the-art IFs on the real-world pretraining corpus CLUECorpus2020, and remains robust across prompt and response lengths.
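
To make the adjustment concrete, the sketch below shows the general shape of such a scheme: per-token gradients are down-weighted before the standard influence computation. This is a minimal illustration, not the paper’s exact method; the function names, the quantile-based cutoff, and the explicit inverse Hessian are all assumptions made for readability (a real implementation would approximate the inverse Hessian rather than materialize it).

# Minimal NumPy sketch of a heuristically adjusted influence function.
# Names and the quantile cutoff are illustrative, not from the paper.
import numpy as np

def adjusted_training_gradient(token_grads, clip_quantile=0.9):
    """Aggregate per-token loss gradients, down-weighting tokens whose
    gradient norm is unusually large (the tokens plain IFs over-weight)."""
    norms = np.linalg.norm(token_grads, axis=1)    # one norm per token
    threshold = np.quantile(norms, clip_quantile)  # heuristic cutoff
    # Scale large-norm tokens down to the threshold instead of dropping them.
    weights = np.minimum(1.0, threshold / np.maximum(norms, 1e-12))
    return (weights[:, None] * token_grads).sum(axis=0)

def influence_score(test_grad, train_token_grads, hessian_inv):
    """Classic influence-function form, -g_test^T H^{-1} g_train, with the
    training-sample gradient replaced by its adjusted version."""
    g_train = adjusted_training_gradient(train_token_grads)
    return -test_grad @ hessian_inv @ g_train

Ranking training samples by this score, highest first, points to the sample most likely responsible for a leaked response.
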
Low Difficulty Summary (written by GrooveSquid.com, original content)
Imagine a computer program that can write sentences or even entire articles. Sounds cool, right? But what if this program uses information from private documents or conversations without permission? This is called “privacy leakage.” In this paper, researchers developed a new way to track down the source of this leakage so it can be addressed. They used something called “Influence Functions” to figure out which parts of the training data were causing the problem. However, they found that these functions weren’t perfect and sometimes pointed to the wrong places. To fix this, they created a new method called “Heuristically Adjusted IF” (HAIF) that corrects for these mistakes. HAIF was able to identify the real sources of privacy leakage much better than before, which is important for keeping our personal information safe.

Keywords

» Artificial intelligence  » Language model  » Pretraining  » Prompt  » Token