Summary of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks, by Jiwon Song et al.
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
by Jiwon Song, Kyungseok Oh, Taesu Kim, Hyungjun Kim, Yulhwa Kim, Jae-Joon Kim
First submitted to arXiv on: 14 Feb 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The paper introduces SLEB, a novel approach to streamlining large language models (LLMs) by eliminating redundant transformer blocks. By choosing the transformer block as the fundamental unit for pruning, SLEB effectively enhances LLM inference speed while maintaining superior perplexity and accuracy. The authors demonstrate that SLEB outperforms previous LLM pruning methods in accelerating LLM inference. This technique has promising implications for enhancing the efficiency of LLMs. |
| Low | GrooveSquid.com (original content) | SLEB is a new way to make language models faster by getting rid of unnecessary parts. It works by looking at how similar nearby parts of the model are and removing the ones that don’t add much value. The results show that SLEB makes language models run faster without sacrificing their ability to understand text well. A rough code sketch of this idea follows the table. |
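The block-level pruning idea described above can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: it assumes, for the sake of example, that a transformer block is "redundant" when its output stays very close to its input on a small calibration batch (measured by cosine similarity), and that blocks are plain tensor-to-tensor callables. The function names, the similarity criterion, and the pruning rule are all illustrative assumptions; SLEB's actual redundancy metric is defined in the paper.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def block_redundancy_scores(blocks, hidden_states):
    """Score each transformer block by how little it changes its input.

    Assumption for this sketch: each block is a callable mapping a tensor of
    shape (batch, seq_len, hidden_dim) to a tensor of the same shape.
    (Real decoder layers, e.g. in Hugging Face models, usually take extra
    arguments and return tuples, so some adaptation would be needed.)
    Returns one score per block: the mean cosine similarity between the
    block's input and output; a higher score suggests a more redundant block.
    """
    scores = []
    x = hidden_states
    for block in blocks:
        y = block(x)
        # Average cosine similarity between input and output token
        # representations over the whole calibration batch.
        sim = F.cosine_similarity(x.flatten(0, 1), y.flatten(0, 1), dim=-1)
        scores.append(sim.mean().item())
        x = y  # feed this block's output into the next block
    return scores

def prune_most_redundant(blocks, scores, num_remove):
    """Drop the num_remove blocks with the highest redundancy scores,
    keeping the remaining blocks in their original order."""
    keep = sorted(range(len(blocks)), key=lambda i: scores[i])[: len(blocks) - num_remove]
    return [blocks[i] for i in sorted(keep)]
```

Because whole blocks are removed rather than individual weights, the pruned model keeps dense matrix shapes and needs no specialized sparse kernels, which is why block-level pruning translates directly into faster inference.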
Keywords
- Artificial intelligence
- Inference
- Perplexity
- Pruning
- Transformer