Summary of Two Are Better Than One: Context Window Extension with Multi-grained Self-injection, by Wei Han et al.
by Wei Han, Pan Zhou, Soujanya Poria, Shuicheng Yan
First submitted to arXiv on: 25 Oct 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper but are written at different levels of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract on the paper’s arXiv page. |
Medium | GrooveSquid.com (original content) | The proposed SharedLLM approach addresses the limited context window of large language models (LLMs) through multi-grained context compression and query-aware information retrieval. The architecture pairs two short-context LLMs (e.g., LLaMA-2): a lower model and an upper model. The lower model acts as a compressor, while the upper model acts as a decoder that performs context-aware modeling on the running text. A tree-style data structure efficiently encodes, stores, and retrieves multi-grained contextual information for text chunks, enabling rapid retrieval of relevant information at different granularity levels based on the input query. Because both models are derived from layers of the same LLM, this information transfer is termed self-injection; a toy sketch of the tree structure follows this table. |
Low | GrooveSquid.com (original content) | SharedLLM is a new approach that helps large language models (LLMs) handle more context. Currently, these models can only understand a limited amount of text at a time. To fix this, SharedLLM uses two short-context models: one compresses the long context, and the other retrieves the compressed information while generating text. This makes LLMs faster and cheaper to use for many applications. |
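To make the tree-style store and query-aware retrieval from the medium summary concrete, here is a minimal Python sketch. It is an illustration under loud assumptions, not the paper’s implementation: the `Node`, `build_context_tree`, and `retrieve` names are invented for this example, raw text slices stand in for the compressed key-value states the lower model would produce, and a toy word-overlap score replaces the learned query-aware scoring.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One tree node: a context span stored at some granularity level."""
    text: str                                  # stand-in for compressed states
    level: int                                 # 0 = coarsest; deeper = finer-grained
    children: list["Node"] = field(default_factory=list)

def build_context_tree(context: str, chunk_size: int = 256,
                       branching: int = 4, max_depth: int = 2) -> Node:
    """Recursively split the context so nodes near the root cover large spans
    coarsely while leaves hold fine-grained chunks."""
    root = Node(text=context[:chunk_size], level=0)   # coarse digest of the whole span
    def split(node: Node, span: str, depth: int) -> None:
        if depth >= max_depth or len(span) <= chunk_size:
            return
        step = max(1, len(span) // branching)
        for i in range(0, len(span), step):
            piece = span[i:i + step]
            child = Node(text=piece[:chunk_size], level=depth + 1)
            node.children.append(child)
            split(child, piece, depth + 1)
    split(root, context, 0)
    return root

def relevance(query: str, text: str) -> float:
    """Toy lexical-overlap score; SharedLLM instead learns query-aware retrieval."""
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / (len(q) or 1)

def retrieve(root: Node, query: str, budget: int = 3) -> list[Node]:
    """Greedy best-first descent: repeatedly take the most query-relevant node
    and expose its children, so the result mixes coarse and fine granularities."""
    selected: list[Node] = []
    frontier = [root]
    while frontier and len(selected) < budget:
        best = max(frontier, key=lambda n: relevance(query, n.text))
        frontier.remove(best)
        selected.append(best)
        frontier.extend(best.children)
    return selected

# Usage: hand the retrieved spans to the decoder-side model as extra context.
tree = build_context_tree("long document text " * 500)
for node in retrieve(tree, query="context window extension"):
    print(node.level, repr(node.text[:40]))
```

In the actual SharedLLM pipeline, as the medium summary describes, the retrieved entries would be compressed hidden states rather than raw text, and the upper model would consume them through the self-injection step instead of reading them as a plain prompt.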
Keywords
» Artificial intelligence » Context window » Decoder » Llama