Summary of Exploiting LLM Quantization, by Kazuki Egashira et al.


Exploiting LLM Quantization

by Kazuki Egashira, Mark Vero, Robin Staab, Jingxuan He, Martin Vechev

First submitted to arXiv on: 28 May 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)

  • Abstract of paper
  • PDF of paper


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract; read it via the "Abstract of paper" link above.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper investigates the security implications of quantization in large language models (LLMs), revealing a previously unknown attack vector that can be exploited to produce harmful quantized LLMs. The authors demonstrate a three-stage attack framework that (1) fine-tunes a malicious LLM, (2) quantizes it and computes constraints characterizing all full-precision models that map to the same quantized model, and (3) tunes out the poisoned behavior while ensuring the weights still satisfy the computed constraints. The result is an LLM that exhibits benign behavior in full precision but adversarial behavior when quantized. The attack is experimentally demonstrated across three diverse scenarios: vulnerable code generation, content injection, and over-refusal attacks.
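The snippet below is a minimal numerical sketch, not the authors' code, of the idea behind stages two and three: many full-precision weight settings round to the same quantized weights, so a model can be moved inside that region without changing its quantized behavior. It assumes a simple symmetric round-to-nearest INT8 scheme and a toy weight vector, and stands in for the "repair" step with a random update that is projected back into the constraint box.

```python
# Illustrative sketch only (not the paper's implementation) of the
# constraint-and-projection idea: compute, for each weight, the interval of
# full-precision values that quantize to the same level, then keep any later
# weight update inside that interval so the quantized model never changes.
import numpy as np

def quantize(w, scale):
    """Symmetric round-to-nearest INT8-style quantization."""
    return np.clip(np.round(w / scale), -127, 127)

def constraint_interval(q, scale):
    """Per-weight interval of full-precision values that round to level q."""
    return (q - 0.5) * scale, (q + 0.5) * scale

rng = np.random.default_rng(0)

# Stage 1 (stand-in): weights of a maliciously fine-tuned toy model.
w_malicious = rng.normal(size=8)
scale = np.abs(w_malicious).max() / 127.0

# Stage 2: quantize and derive constraints that preserve the quantized model.
q_levels = quantize(w_malicious, scale)
lo, hi = constraint_interval(q_levels, scale)

# Stage 3 (stand-in): a "repair" update that would remove the poisoned
# behavior in full precision; here simulated by a random perturbation that is
# projected back into the constraint box (a small margin avoids rounding ties).
margin = 1e-6 * scale
w_repaired = w_malicious + rng.normal(scale=0.01, size=8)
w_projected = np.clip(w_repaired, lo + margin, hi - margin)

# The full-precision weights moved, but the quantized weights are identical.
print("weights changed:", not np.allclose(w_projected, w_malicious))
print("same quantized model:", np.array_equal(quantize(w_projected, scale), q_levels))
```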
Low Difficulty Summary (written by GrooveSquid.com, original content)
Large language models (LLMs) are used for many tasks such as translation, summarization, and chatbots. To make them run on devices with limited memory, researchers use a technique called quantization, which reduces the precision of the model's weights to save memory. In this paper, the authors study how this process can be abused by attackers to create malicious models that look safe but are actually dangerous. They show that an attacker could first fine-tune a model to behave maliciously, then adjust it so that it appears benign at full precision while the harmful behavior reappears once the model is quantized. This is a threat because millions of people use quantized models without knowing they could be at risk.
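As a rough illustration of the memory point (a generic sketch, not tied to any particular quantization library or to the paper's setup), the snippet below stores a weight tensor as INT8 values plus a single scale factor instead of 32-bit floats, cutting storage to roughly a quarter at the cost of a bounded rounding error.

```python
# Generic illustration of weight quantization: keep low-precision integers
# plus a scale factor instead of full-precision floats, trading a small,
# bounded rounding error for a ~4x reduction in memory.
import numpy as np

weights_fp32 = np.random.default_rng(0).normal(size=1_000_000).astype(np.float32)

# Symmetric INT8 quantization with one shared scale for the whole tensor.
scale = float(np.abs(weights_fp32).max()) / 127.0
weights_int8 = np.clip(np.round(weights_fp32 / scale), -127, 127).astype(np.int8)

# Dequantizing recovers an approximation; the error is at most half a step.
weights_dequant = weights_int8.astype(np.float32) * scale

print(f"fp32 storage: {weights_fp32.nbytes / 1e6:.1f} MB")  # ~4.0 MB
print(f"int8 storage: {weights_int8.nbytes / 1e6:.1f} MB")  # ~1.0 MB
print(f"max rounding error: {np.abs(weights_fp32 - weights_dequant).max():.5f}")
print(f"half a quantization step: {scale / 2:.5f}")
```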

Keywords

» Artificial intelligence  » Precision  » Quantization  » Summarization  » Translation