
Summary of SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models, by Yang Cao


SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models

by Yang Cao

First submitted to arxiv on: 21 Aug 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (original content by GrooveSquid.com)
The proposed Singular Values and Orthonormal Regularized Singular Vectors Adaptation (SORSA) method is a novel parameter-efficient fine-tuning (PEFT) approach that combines trainable principal singular weights with frozen residual weights. Both parts are initialized via singular value decomposition (SVD) of the pre-trained weights, and an orthonormal regularizer decreases the condition number of the principal weights, making optimization more efficient. After training, SORSA adapters can be merged back into the model, so inference incurs no extra latency. Compared against LoRA, PiSSA, and full fine-tuning on the GSM-8K and MATH benchmarks, SORSA performs best on both. For example, on GSM-8K, Llama 2 7B adapted with SORSA achieved 56.03% accuracy, outperforming the other methods.
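The SVD-based split described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function names (`sorsa_init`, `orthonormal_penalty`), the rank parameter `r`, and the exact form of the regularizer are assumptions inferred from the summary.

```python
import numpy as np

def sorsa_init(W, r):
    """Split a pre-trained weight W into a trainable low-rank 'principal'
    part (top-r singular triplets) and a frozen residual, as the summary
    describes for SORSA-style initialization."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    U_p, S_p, Vt_p = U[:, :r], S[:r], Vt[:r, :]   # trainable principal factors
    W_res = W - U_p @ np.diag(S_p) @ Vt_p         # frozen residual weight
    return U_p, S_p, Vt_p, W_res

def orthonormal_penalty(U_p, Vt_p):
    """Assumed form of the orthonormal regularizer: penalize deviation of
    the singular-vector factors from orthonormality (Frobenius norm of
    U^T U - I and V V^T - I). This is what keeps the principal weights
    well-conditioned during training."""
    I = np.eye(U_p.shape[1])
    return (np.linalg.norm(U_p.T @ U_p - I, "fro")
            + np.linalg.norm(Vt_p @ Vt_p.T - I, "fro"))
```

At initialization the penalty is essentially zero (SVD factors are exactly orthonormal), and the principal plus residual parts reconstruct the original weight, which is why merging the adapter after training adds no inference latency.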
Low Difficulty Summary (original content by GrooveSquid.com)
SORSA is a new way to fine-tune models without retraining them from scratch. It’s like a shortcut that helps models learn faster and better. The method uses special weights that are initialized with information from pre-trained models. This makes the model learn more efficiently and accurately. SORSA is tested on two benchmarks, GSM-8K and MATH, and performs better than other methods in both cases.

Keywords

» Artificial intelligence  » Inference  » Llama  » Lora  » Optimization