
Summary of QUITO-X: A New Perspective on Context Compression from the Information Bottleneck Theory, by Yihang Wang et al.


QUITO-X: A New Perspective on Context Compression from the Information Bottleneck Theory

by Yihang Wang, Xu Huang, Bowen Tian, Yueyang Su, Lei Yu, Huaming Liao, Yixing Fan, Jiafeng Guo, Xueqi Cheng

First submitted to arXiv on: 20 Aug 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (paper authors)
The high difficulty summary is the paper's original abstract; read the original abstract here.

Medium Difficulty Summary (GrooveSquid.com, original content)
The paper addresses the challenge of compressing long contexts for large language models (LLMs) in industrial applications. When LLMs are used for complex tasks, excessive context drives up cost and inference latency. To overcome this, the researchers introduce information bottleneck (IB) theory to model the problem, providing a novel perspective on context compression. A cross-attention-based approach is proposed to approximate the mutual information terms in the IB objective, allowing flexible alternatives in different scenarios. Experimental results on four datasets show that the method achieves a 25% higher compression rate than state-of-the-art methods while maintaining question-answering performance.
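The paper itself does not include code here, but the core idea of the cross-attention-based approach can be illustrated with a toy sketch: score each context token by its attention weight with respect to the question, then keep only the highest-scoring fraction. All names and the scoring scheme below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def compress_context(query_vec, token_vecs, tokens, keep_ratio=0.5):
    """Toy sketch of attention-based context compression.

    Scores each context token by a scaled dot product with the query
    vector (a stand-in for cross-attention), softmaxes the scores, and
    keeps the top fraction of tokens in their original order.
    """
    d = query_vec.shape[-1]
    logits = token_vecs @ query_vec / np.sqrt(d)   # attention logits
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                       # softmax weights
    k = max(1, int(len(tokens) * keep_ratio))
    keep = np.sort(np.argsort(weights)[-k:])       # top-k, original order
    return [tokens[i] for i in keep]

# Hypothetical example: the query vector points along the first axis,
# so tokens "b" and "d" (large first components) score highest.
query = np.array([1.0, 0.0])
ctx = np.array([[0.0, 1.0], [5.0, 0.0], [0.0, 2.0], [3.0, 0.0]])
print(compress_context(query, ctx, ["a", "b", "c", "d"], keep_ratio=0.5))
# → ['b', 'd']
```

In the paper's setting the attention weights would come from a trained model's cross-attention between the question and the context, rather than a raw dot product, but the selection step (rank tokens, keep the top fraction) follows the same shape.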
Low Difficulty Summary (GrooveSquid.com, original content)
The study aims to help large language models (LLMs) work better with long contexts. Right now, LLMs struggle when they have too much information to consider, which makes them slow and expensive to use. The researchers found a new way to think about this problem using something called information bottleneck theory. They also created a method that helps LLMs focus on the most important parts of the context. By doing so, LLMs can respond more quickly and make better decisions.

Keywords

» Artificial intelligence  » Cross attention  » Inference  » Question answering