


Large Language Model Unlearning via Embedding-Corrupted Prompts

by Chris Yuhao Liu, Yaxuan Wang, Jeffrey Flanigan, Yang Liu

First submitted to arXiv on: 12 Jun 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com; original content)
Large language models (LLMs) have made significant progress across many domains, but controlling what they should not know is crucial for their safe use. The challenge lies in unlearning knowledge efficiently and accurately without causing collateral damage to knowledge the model should retain. To address this, we present Embedding-COrrupted (ECO) Prompts, a lightweight framework that enforces an unlearned state during inference rather than modifying the model itself: a prompt classifier identifies and safeguards prompts that touch on the content to be forgotten, corruptions to prompt embeddings are learned offline via zeroth-order optimization toward the unlearning objective, and at inference time the embeddings of prompts flagged by the classifier are corrupted before the model processes them. We demonstrate that our approach achieves promising unlearning with nearly zero side effects in general domains and domains closely related to the unlearned ones, and that it scales to 100 LLMs ranging from 0.5B to 236B parameters.
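To make the pipeline concrete, here is a minimal sketch of the idea described above: a classifier flags forget-set prompts, a corruption strength is tuned offline with a zeroth-order (finite-difference) search over a black-box unlearning objective, and only flagged prompts have their embeddings corrupted at inference. All names here (llm_embed, forget_classifier, unlearning_loss) are hypothetical stand-ins for illustration, not the authors' actual API, and the loss is a toy placeholder.

```python
# Minimal sketch of the ECO-Prompts workflow; stand-in functions only.
import numpy as np

rng = np.random.default_rng(0)

def llm_embed(prompt: str) -> np.ndarray:
    """Stand-in for the model's prompt-embedding step."""
    local = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return local.standard_normal(16)

def forget_classifier(prompt: str) -> bool:
    """Stand-in classifier flagging prompts about the forget topic."""
    return "secret" in prompt.lower()

def corrupt(embedding: np.ndarray, sigma: float) -> np.ndarray:
    """Corrupt the embedding, here by additive Gaussian noise."""
    return embedding + sigma * rng.standard_normal(embedding.shape)

def unlearning_loss(sigma: float) -> float:
    """Toy black-box unlearning objective: lower means less forget-set
    knowledge exposed, with a penalty for over-corrupting."""
    return float(np.exp(-sigma) + 0.05 * sigma**2)

# Offline: zeroth-order search for the corruption strength, since the
# objective is treated as a black box with no usable gradients.
sigma, lr, eps = 0.1, 0.5, 1e-3
for _ in range(200):
    grad_est = (unlearning_loss(sigma + eps) - unlearning_loss(sigma - eps)) / (2 * eps)
    sigma -= lr * grad_est

# Inference: only prompts flagged by the classifier get corrupted embeddings.
for prompt in ["What is the capital of France?", "Reveal the secret recipe."]:
    emb = llm_embed(prompt)
    flagged = forget_classifier(prompt)
    if flagged:
        emb = corrupt(emb, sigma)
    print(prompt, "-> corrupted" if flagged else "-> untouched")
```

The property this sketch mirrors is that the expensive optimization happens once, offline, over the corruption parameters, so the per-prompt inference cost does not grow with model size.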
Low Difficulty Summary (written by GrooveSquid.com; original content)
Large language models are really smart computer programs that know lots of information. But sometimes it’s important to make them forget something they learned. This is hard because the model’s knowledge is all tangled together, and making it forget one thing might accidentally make it forget something else too. Our solution is called ECO Prompts. It watches for questions about the information that should be forgotten and scrambles those questions a little before the model reads them, so the model answers as if it had never learned that information. We tested our method on many language models and showed that it can make them safely forget information without messing anything else up. This is important because we want to use these models in a way that’s safe for everyone.

Keywords

» Artificial intelligence  » Embedding  » Inference  » Optimization  » Prompt