


Time Will Tell: Timing Side Channels via Output Token Count in Large Language Models

by Tianchen Zhang, Gururaj Saileshwar, David Lie

First submitted to arXiv on: 19 Dec 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL); Cryptography and Security (cs.CR)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (the paper's original abstract, written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (original content by GrooveSquid.com)
The paper presents a novel side-channel attack that lets an adversary extract sensitive information about inference inputs to large language models (LLMs) based solely on the number of output tokens. The attack is demonstrated on two common LLM tasks: machine translation and text classification. Experiments show that the attack recovers the target language of a machine translation task with more than 75% precision across three different models, and the input class in text classification tasks with more than 70% precision on both open-source LLMs and production models. The paper also proposes mitigations against the output-token-count side channel.
Low Difficulty Summary (original content by GrooveSquid.com)
This research shows how an attacker can figure out what language someone is translating into, or which category their text belongs to, just by looking at how long a large language model's response is. Because the number of output tokens depends on the input, the length of the response leaks information the model doesn't protect. The researchers found that an attacker who can observe the size or timing of the responses can guess the right answer most of the time, and they suggest ways to defend against this.
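
At its core, the attack described above amounts to matching an observed output token count against per-class profiles. The sketch below is only a toy illustration of that idea under assumed numbers, not the authors' actual method: the language names, token counts, and the guess_target_language helper are all hypothetical, and a real adversary would build the profiles by querying the target model.

```python
# Hypothetical sketch of an output-token-count side channel.
# The per-language token counts are made-up illustrative numbers; a real
# attacker would profile the actual model under attack.

import statistics

# Step 1 (profiling): the attacker submits their own translation requests and
# records how many output tokens each candidate target language produces.
attacker_profiles = {
    "French":  [48, 51, 50, 47, 52],   # token counts observed for French outputs
    "German":  [55, 58, 57, 56, 59],   # German outputs run longer in this toy data
    "Chinese": [32, 30, 33, 31, 29],   # Chinese outputs use fewer tokens here
}

# Mean token count per candidate language.
profile_means = {lang: statistics.mean(counts)
                 for lang, counts in attacker_profiles.items()}


def guess_target_language(observed_token_count: int) -> str:
    """Guess the victim's target language from the output token count alone,
    by picking the profile whose mean count is closest to the observation."""
    return min(profile_means,
               key=lambda lang: abs(profile_means[lang] - observed_token_count))


# Step 2 (attack): the attacker observes only the size (or generation time) of
# the victim's response, e.g. 31 output tokens, and infers the likely language.
print(guess_target_language(31))  # -> "Chinese" in this toy example
```

In practice the token count is not observed directly; the paper's title points to timing as the proxy, since each generated token adds roughly constant latency, so response time stands in for token count.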

Keywords

» Artificial intelligence  » Classification  » Inference  » Large language model  » Precision  » Text classification  » Token  » Translation