Summary of Two Are Better Than One: Context Window Extension with Multi-grained Self-injection, by Wei Han et al.
by Wei Han, Pan Zhou, Soujanya Poria, Shuicheng Yan
First submitted to arXiv on: 25 Oct 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper but are written at different levels of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract on the paper’s arXiv page. |
Medium | GrooveSquid.com (original content) | The proposed SharedLLM approach addresses the limited context window of large language models (LLMs) through multi-grained context compression and query-aware information retrieval. The architecture pairs two short-context LLMs (e.g., LLaMA-2): a lower model and an upper model. The lower model acts as a compressor, while the upper model acts as a decoder that performs context-aware modeling on the running text. A tree-style data structure efficiently encodes, stores, and retrieves multi-grained contextual information for text chunks, enabling rapid retrieval of relevant information at different granularity levels based on the input query. Because both models are derived from layers of the same LLM, this information transfer is termed self-injection; a toy sketch of the tree structure follows this table. |
Low | GrooveSquid.com (original content) | SharedLLM is a new approach that helps large language models (LLMs) handle more context. Currently, these models can only understand a limited amount of text at a time. To fix this, SharedLLM uses two short-context models: one compresses the long context, and the other retrieves the compressed information while generating text. This makes LLMs faster and cheaper to use for many applications. |
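To make the tree-style store and query-aware retrieval from the medium summary concrete, here is a minimal Python sketch. It is an illustration under loud assumptions, not the paper’s implementation: the `Node`, `build_context_tree`, and `retrieve` names are invented for this example, raw text slices stand in for the compressed key-value states the lower model would produce, and a toy word-overlap score replaces the learned query-aware scoring.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One tree node: a context span stored at some granularity level."""
    text: str                                  # stand-in for compressed states
    level: int                                 # 0 = coarsest; deeper = finer-grained
    children: list["Node"] = field(default_factory=list)

def build_context_tree(context: str, chunk_size: int = 256,
                       branching: int = 4, max_depth: int = 2) -> Node:
    """Recursively split the context so nodes near the root cover large spans
    coarsely while leaves hold fine-grained chunks."""
    root = Node(text=context[:chunk_size], level=0)   # coarse digest of the whole span
    def split(node: Node, span: str, depth: int) -> None:
        if depth >= max_depth or len(span) <= chunk_size:
            return
        step = max(1, len(span) // branching)
        for i in range(0, len(span), step):
            piece = span[i:i + step]
            child = Node(text=piece[:chunk_size], level=depth + 1)
            node.children.append(child)
            split(child, piece, depth + 1)
    split(root, context, 0)
    return root

def relevance(query: str, text: str) -> float:
    """Toy lexical-overlap score; SharedLLM instead learns query-aware retrieval."""
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / (len(q) or 1)

def retrieve(root: Node, query: str, budget: int = 3) -> list[Node]:
    """Greedy best-first descent: repeatedly take the most query-relevant node
    and expose its children, so the result mixes coarse and fine granularities."""
    selected: list[Node] = []
    frontier = [root]
    while frontier and len(selected) < budget:
        best = max(frontier, key=lambda n: relevance(query, n.text))
        frontier.remove(best)
        selected.append(best)
        frontier.extend(best.children)
    return selected

# Usage: hand the retrieved spans to the decoder-side model as extra context.
tree = build_context_tree("long document text " * 500)
for node in retrieve(tree, query="context window extension"):
    print(node.level, repr(node.text[:40]))
```

In the actual SharedLLM pipeline, as the medium summary describes, the retrieved entries would be compressed hidden states rather than raw text, and the upper model would consume them through the self-injection step instead of reading them as a plain prompt.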
Keywords
» Artificial intelligence » Context window » Decoder » Llama