Summary of Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression, by Jingcun Wang et al.
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression
by Jingcun Wang, Yu-Guang Chen, Ing-Chao Lin, Bing Li, Grace Li Zhang
First submitted to arXiv on: 2 Oct 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
| --- | --- | --- |
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | Large Language Models (LLMs) have achieved impressive results, but their large number of parameters requires significant memory for inference, hindering practical deployment in many applications. Singular value decomposition (SVD) is a promising way to approximate weight matrices and compress LLMs. This paper explores parameter sharing across different layers with SVD to achieve more effective compression. The approach represents each weight matrix as a linear combination of shared basis vectors and layer-specific coefficients, and the paper examines which types of weight matrices and which layers are suitable for basis sharing when compressing LLMs. Comprehensive experiments demonstrate that Basis Sharing outperforms state-of-the-art SVD-based compression approaches and parameter sharing techniques, especially under large compression ratios. (A simplified numerical sketch of the shared-basis idea follows the table.) |
| Low | GrooveSquid.com (original content) | Large Language Models are super smart computers that can understand and generate human-like text. But these models have a big problem: they need lots of memory to work properly. To fix this, scientists use something called singular value decomposition (SVD) to make the models smaller without losing their brains. In this paper, researchers take it a step further by sharing information between different parts of the model using SVD. They show that this works really well and can even make the model better when we make it smaller. |
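
The shared-basis idea can be illustrated with a small numerical sketch. The snippet below is an assumption-laden illustration only: it stacks two layers' weight matrices, computes one SVD to obtain a common basis, and keeps per-layer coefficients. The matrix shapes, the stacking strategy, and the rank `k` are chosen for illustration and do not reproduce the paper's exact procedure (for example, which weight types are shared or how layers are grouped).

```python
import numpy as np

# Minimal sketch of cross-layer basis sharing via SVD (illustrative only).
rng = np.random.default_rng(0)
d_out, d_in, k = 64, 64, 16                 # k = number of shared basis vectors

W1 = rng.standard_normal((d_out, d_in))     # stand-in for layer-1 weights
W2 = rng.standard_normal((d_out, d_in))     # stand-in for layer-2 weights

# Stack both layers and take one SVD so they share the same basis (row space).
stacked = np.vstack([W1, W2])               # shape (2 * d_out, d_in)
_, _, Vt = np.linalg.svd(stacked, full_matrices=False)
basis = Vt[:k]                              # shared basis vectors, shape (k, d_in)

# Layer-specific coefficients: project each weight matrix onto the shared basis.
C1 = W1 @ basis.T                           # shape (d_out, k)
C2 = W2 @ basis.T

# Reconstruction: W_i is approximated by its own coefficients times the shared basis.
W1_hat = C1 @ basis
W2_hat = C2 @ basis

orig_params = W1.size + W2.size
shared_params = C1.size + C2.size + basis.size   # basis stored once for both layers
print(f"parameters: {orig_params} -> {shared_params}")
print("relative error, layer 1:", np.linalg.norm(W1 - W1_hat) / np.linalg.norm(W1))
print("relative error, layer 2:", np.linalg.norm(W2 - W2_hat) / np.linalg.norm(W2))
```

Because the basis is stored only once, the per-layer cost drops to the coefficient matrices; larger compression ratios correspond to smaller `k`, trading reconstruction error for memory.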
Keywords
» Artificial intelligence » Inference