
Summary of SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models, by Yang Cao


SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models

by Yang Cao

First submitted to arxiv on: 21 Aug 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (original content by GrooveSquid.com)
The proposed Singular Values and Orthonormal Regularized Singular Vectors Adaptation (SORSA) method is a novel parameter-efficient fine-tuning (PEFT) approach that combines trainable principal singular weights with frozen residual weights. Both parts are initialized via singular value decomposition (SVD) of the pre-trained weights, and an orthonormal regularizer decreases the condition number of the principal weights, making optimization more efficient. After training, SORSA adapters can be merged back into the model, so inference incurs no extra latency. Compared against LoRA, PiSSA, and full fine-tuning on the GSM-8K and MATH benchmarks, SORSA performs best on both. For example, on GSM-8K, Llama 2 7B adapted with SORSA achieved 56.03% accuracy, outperforming the other methods.
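The SVD-based split described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function names (`sorsa_init`, `orthonormal_penalty`), the rank parameter `r`, and the exact form of the regularizer are assumptions inferred from the summary.

```python
import numpy as np

def sorsa_init(W, r):
    """Split a pre-trained weight W into a trainable low-rank 'principal'
    part (top-r singular triplets) and a frozen residual, as the summary
    describes for SORSA-style initialization."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    U_p, S_p, Vt_p = U[:, :r], S[:r], Vt[:r, :]   # trainable principal factors
    W_res = W - U_p @ np.diag(S_p) @ Vt_p         # frozen residual weight
    return U_p, S_p, Vt_p, W_res

def orthonormal_penalty(U_p, Vt_p):
    """Assumed form of the orthonormal regularizer: penalize deviation of
    the singular-vector factors from orthonormality (Frobenius norm of
    U^T U - I and V V^T - I). This is what keeps the principal weights
    well-conditioned during training."""
    I = np.eye(U_p.shape[1])
    return (np.linalg.norm(U_p.T @ U_p - I, "fro")
            + np.linalg.norm(Vt_p @ Vt_p.T - I, "fro"))
```

At initialization the penalty is essentially zero (SVD factors are exactly orthonormal), and the principal plus residual parts reconstruct the original weight, which is why merging the adapter after training adds no inference latency.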
Low Difficulty Summary (original content by GrooveSquid.com)
SORSA is a new way to fine-tune models without retraining them from scratch. It’s like a shortcut that helps models learn faster and better. The method uses special weights that are initialized with information from pre-trained models. This makes the model learn more efficiently and accurately. SORSA is tested on two benchmarks, GSM-8K and MATH, and performs better than other methods in both cases.

Keywords

» Artificial intelligence  » Inference  » Llama  » Lora  » Optimization