Summary of SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models, by Yang Cao
SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models
by Yang Cao
First submitted to arXiv on: 21 Aug 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computation and Language (cs.CL)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The proposed Singular Values and Orthonormal Regularized Singular Vectors Adaptation (SORSA) method is a novel parameter-efficient fine-tuning (PEFT) approach that combines trainable principal singular weights with frozen residual weights. Both parts are initialized via singular value decomposition (SVD) of the pre-trained weights, and an orthonormal regularizer reduces the condition number of the principal weights, making optimization more efficient. SORSA adapters can be merged into the base weights for inference, eliminating added latency. Compared against LoRA, PiSSA, and full fine-tuning on the GSM-8K and MATH benchmarks, SORSA performs best on both; for example, Llama 2 7B adapted with SORSA achieved 56.03% accuracy on GSM-8K, outperforming the other methods. |
| Low | GrooveSquid.com (original content) | SORSA is a new way to fine-tune models without retraining them from scratch. It is like a shortcut that helps models learn faster and better. The method uses special weights initialized with information from the pre-trained model, which helps the model learn more efficiently and accurately. SORSA is tested on two benchmarks, GSM-8K and MATH, and performs better than other methods on both. |
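To make the SVD-based setup concrete, here is a minimal NumPy sketch of the ideas described in the medium summary: split a pre-trained weight matrix into a trainable principal part (top-r singular triplets) and a frozen residual, apply an orthonormality penalty to the principal singular vectors, and merge everything back into one dense matrix for inference. This is an illustrative sketch, not the paper's exact parameterization; the function names (`sorsa_init`, `orthonormal_reg`) and the rank `r` are assumptions for the example.

```python
import numpy as np

def sorsa_init(W, r):
    """Split W into trainable principal singular components (top-r)
    and a frozen residual, via SVD of the pre-trained weights.
    Illustrative sketch; the paper's parameterization may differ."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    Up, Sp, Vtp = U[:, :r], S[:r], Vt[:r, :]          # principal (trainable)
    W_res = U[:, r:] @ np.diag(S[r:]) @ Vt[r:, :]     # residual (frozen)
    return Up, Sp, Vtp, W_res

def orthonormal_reg(Up, Vtp):
    """Penalty pushing Up^T Up and Vtp Vtp^T toward the identity,
    keeping the principal singular vectors orthonormal (which, per the
    summary, lowers the condition number of the trainable part)."""
    I = np.eye(Up.shape[1])
    return (np.linalg.norm(Up.T @ Up - I, "fro") ** 2
            + np.linalg.norm(Vtp @ Vtp.T - I, "fro") ** 2)

# Merging for inference: principal + residual reconstruct one dense
# matrix, so the adapter adds no extra latency at inference time.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))
Up, Sp, Vtp, W_res = sorsa_init(W, r=2)
W_merged = Up @ np.diag(Sp) @ Vtp + W_res
print(np.allclose(W_merged, W))  # True: merge recovers the original weights
```

At initialization the regularizer is already (numerically) zero, since SVD returns orthonormal singular vectors; during training it keeps the updated principal vectors from drifting away from orthonormality.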
Keywords
» Artificial intelligence » Inference » Llama » LoRA » Optimization