


Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression

by Jingcun Wang, Yu-Guang Chen, Ing-Chao Lin, Bing Li, Grace Li Zhang

First submitted to arXiv on: 2 Oct 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The paper's original abstract, available on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Large Language Models (LLMs) have achieved impressive results, but their large number of parameters requires significant memory during inference, hindering practical deployment in many applications. To address this challenge, singular value decomposition (SVD) is a promising way to approximate weight matrices and compress LLMs. This paper explores parameter sharing across different layers combined with SVD to achieve more effective compression: weight matrices are represented as linear combinations of shared basis vectors and layer-specific coefficients. The paper also examines which types of weight matrices and which layer groupings are suitable for basis sharing when compressing LLMs. Comprehensive experiments demonstrate that Basis Sharing outperforms state-of-the-art SVD-based compression approaches and parameter-sharing techniques, especially at large compression ratios.
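
To make the core idea concrete, below is a minimal NumPy sketch of this kind of cross-layer basis sharing: the weight matrices of several layers are stacked, one truncated SVD provides a basis shared by all of them, and each layer retains only its own small coefficient matrix. The function name share_basis, the choice of which layers share a basis, and how the singular values are folded in are illustrative assumptions, not the paper's exact algorithm.

    # Illustrative sketch of cross-layer basis sharing via SVD; not the
    # paper's exact method, names and layer grouping are assumptions.
    import numpy as np

    def share_basis(weights, rank):
        # weights: list of (d_out, d_in) arrays from layers chosen to share a basis
        # rank: number of shared basis vectors kept after truncation
        # Stack the matrices side by side so a single SVD yields one basis
        # whose columns approximately span all of them.
        stacked = np.concatenate(weights, axis=1)        # (d_out, n_layers * d_in)
        U, S, Vt = np.linalg.svd(stacked, full_matrices=False)
        shared_basis = U[:, :rank]                       # (d_out, rank), stored once
        # Each layer keeps only its own small coefficient matrix.
        coeffs = [shared_basis.T @ W for W in weights]   # each (rank, d_in)
        return shared_basis, coeffs

    # Usage: two hypothetical 512x512 layer weights compressed with one shared basis.
    rng = np.random.default_rng(0)
    W1 = rng.standard_normal((512, 512))
    W2 = rng.standard_normal((512, 512))
    B, (C1, C2) = share_basis([W1, W2], rank=128)
    print(np.linalg.norm(W1 - B @ C1) / np.linalg.norm(W1))  # relative reconstruction error

Under this approximation, storage for n layers of d_out x d_in matrices drops from n * d_out * d_in parameters to one shared d_out * rank basis plus n * rank * d_in coefficient matrices.
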
Low Difficulty Summary (written by GrooveSquid.com, original content)
Large Language Models are super smart computer programs that can understand and generate human-like text. But these models have a big problem: they need lots of memory to work properly. To fix this, scientists use something called singular value decomposition (SVD) to make the models smaller without losing their brains. In this paper, researchers take it a step further by sharing information between different parts of the model using SVD. They show that this works better than earlier ways of shrinking models, especially when the models are made much smaller.

Keywords

» Artificial intelligence  » Inference