Summary of Information Leakage From Embedding in Large Language Models, by Zhipeng Wan et al.
Information Leakage from Embedding in Large Language Models
by Zhipeng Wan, Anda Cheng, Yinggui Wang, Lei Wang
First submitted to arXiv on: 20 May 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Cryptography and Security (cs.CR)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper but are written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The study investigates privacy risks posed by input reconstruction attacks on large language models (LLMs). It develops methods that reconstruct original texts from a model’s hidden states (a minimal illustrative sketch of this kind of attack follows the table) and examines how their effectiveness differs between shallow and deep layers. Two base methods are introduced, which succeed to varying degrees when attacking embeddings. To overcome their limitations, the Embed Parrot method is proposed and shown to reconstruct inputs stably from the hidden states of ChatGLM-6B and Llama2-7B. A defense mechanism is also introduced to deter exploitation of the embedding reconstruction process, underscoring the importance of safeguarding user privacy in distributed learning systems. |
Low | GrooveSquid.com (original content) | Large language models (LLMs) are super smart computers that can understand and generate human-like text. But some people worry that these models could be used to invade our personal privacy. A team of researchers wants to find out if this is possible. They’re looking at a type of attack called input reconstruction, where someone tries to figure out what you originally typed into the model just by looking at the way it processed your words. The scientists developed two new ways to do this and tested them on some big language models. These methods work well when attacking the shallower parts of the model’s processing, but not as well on the deeper parts. To solve this problem, they came up with a new method called Embed Parrot that can reconstruct what you typed even from the deepest parts of the model. The researchers also found a way to make it harder for someone to use these methods to invade our privacy. All of this is important because we want to keep our personal information safe when using language models. |
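To make the threat model more concrete, here is a minimal, hypothetical sketch of a hidden-state inversion attack of the kind the summaries describe. It is not the paper’s Embed Parrot method or its defense: it uses GPT-2 (via Hugging Face transformers) as a lightweight stand-in for ChatGLM-6B and Llama2-7B, an arbitrarily chosen intermediate layer, and a simple linear inversion head trained to map per-token hidden states back to token ids. All model names, layer indices, and the toy corpus are assumptions made for illustration.

```python
# Illustrative sketch of a hidden-state (embedding) inversion attack.
# NOTE: this is NOT the paper's Embed Parrot method; it only shows the general
# setup: observe an intermediate hidden state, then train a small "inversion
# head" to map it back to token ids. GPT-2 is a stand-in for the paper's models.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"     # lightweight stand-in model (assumption)
ATTACK_LAYER = 6        # hypothetical intermediate layer the attacker observes

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

# A small learned inversion head: intermediate hidden state -> vocabulary logits.
inversion_head = nn.Linear(model.config.hidden_size, model.config.vocab_size)
optimizer = torch.optim.Adam(inversion_head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def hidden_states_at_layer(texts, layer):
    """Return per-token hidden states from the chosen layer (the attacker's view)."""
    batch = tokenizer(texts, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**batch, output_hidden_states=True)
    return out.hidden_states[layer], batch["input_ids"], batch["attention_mask"]

# Toy training data the attacker can run through the frozen model.
corpus = ["the patient reported chest pain", "please transfer 500 dollars to account 42"]

# Precompute the attacker's observations once (the victim model is frozen),
# then train the inversion head to recover token ids from those activations.
hs, input_ids, mask = hidden_states_at_layer(corpus, ATTACK_LAYER)
for step in range(200):
    logits = inversion_head(hs)                               # (batch, seq, vocab)
    loss = loss_fn(logits[mask.bool()], input_ids[mask.bool()])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Reconstruction attempt: greedily decode each position from the inverted logits.
hs, _, _ = hidden_states_at_layer(["the patient reported chest pain"], ATTACK_LAYER)
recovered = inversion_head(hs).argmax(dim=-1)
print(tokenizer.decode(recovered[0]))
```

According to the summaries above, simple approaches like this reportedly recover inputs more easily from shallow layers than from deep ones; closing that gap on deep layers is what the paper’s Embed Parrot method is designed to do.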
Keywords
» Artificial intelligence » Embedding