Summary of Exploiting LLM Quantization, by Kazuki Egashira et al.
Exploiting LLM Quantization
by Kazuki Egashira, Mark Vero, Robin Staab, Jingxuan He, Martin Vechev
First submitted to arXiv on: 28 May 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper’s original abstract |
Medium | GrooveSquid.com (original content) | The paper investigates the security implications of quantization in large language models (LLMs), revealing a previously unknown attack vector that can be exploited to produce harmful quantized LLMs. The authors demonstrate a three-stage attack framework: it first fine-tunes an LLM to behave maliciously, then quantizes it and computes constraints characterizing all full-precision models that map to the same quantized model, and finally tunes out the poisoned behavior while keeping the weights within the computed constraints (a simplified sketch of the constraint idea appears below this table). The result is an LLM that behaves benignly in full precision but adversarially once quantized. The attack is experimentally demonstrated across three diverse scenarios: vulnerable code generation, content injection, and over-refusal. |
Low | GrooveSquid.com (original content) | Large language models (LLMs) are used for many tasks such as translation, summarization, and chatbots. To make them work on devices with limited memory, researchers use a technique called quantization, which reduces the precision of the model’s weights to save memory. In this paper, scientists study how this process can be misused by attackers to create models that seem safe but are actually dangerous. They show that an attacker could first fine-tune a model to behave badly, then tweak it so it looks safe at full precision while the harmful behavior comes back once the model is quantized. This is a threat because millions of people use these models without knowing they are at risk. |
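
To make the three-stage idea in the medium summary concrete, here is a minimal NumPy sketch, not the paper’s actual procedure: it uses a toy per-tensor symmetric round-to-nearest int8 quantizer (the paper targets widely used zero-shot quantization methods), and the function names (`quantize_int8`, `preservation_constraints`) are illustrative. The key point is that every weight has an interval of full-precision values that map to the same quantized value, so “repair” training can stay inside those intervals and leave the quantized model unchanged.

```python
# Toy sketch of quantization-preserving constraints (assumptions: per-tensor
# symmetric round-to-nearest int8; names and margins are illustrative only).
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor round-to-nearest int8 quantization."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def preservation_constraints(w):
    """Per-weight interval [lo, hi] of full-precision values that keep the same
    int8 code. Intervals are slightly shrunk to avoid rounding ties, capped at the
    original absmax, and the absmax weight(s) are pinned so the scale is unchanged."""
    q, scale = quantize_int8(w)
    lo = (q.astype(np.float32) - 0.499) * scale
    hi = (q.astype(np.float32) + 0.499) * scale
    absmax = np.abs(w).max()
    lo, hi = np.maximum(lo, -absmax), np.minimum(hi, absmax)
    pin = np.abs(w) == absmax          # weight(s) that determine the scale
    lo[pin] = w[pin]
    hi[pin] = w[pin]
    return q, lo, hi

# Stage 1 (not shown): fine-tune a full-precision model to behave maliciously.
w_malicious = np.random.randn(4, 4).astype(np.float32)

# Stage 2: quantize it and compute constraints that keep the quantized model fixed.
q_malicious, lo, hi = preservation_constraints(w_malicious)

# Stage 3 (sketched): repair-train the full-precision model to remove the malicious
# behavior, projecting the weights back into the constraint box after each update.
w_repaired = w_malicious + 0.01 * np.random.randn(4, 4).astype(np.float32)
w_final = np.clip(w_repaired, lo, hi)

# The benign-looking full-precision model still quantizes to the malicious weights.
assert np.array_equal(quantize_int8(w_final)[0], q_malicious)
```

In the paper’s full attack, this constraint-respecting repair is done by fine-tuning the full-precision model while keeping every weight inside its interval, so the model appears benign in full precision yet still quantizes to the poisoned model.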
Keywords
» Artificial intelligence » Precision » Quantization » Summarization » Translation