Summary of One Initialization to Rule Them All: Fine-tuning via Explained Variance Adaptation, by Fabian Paischer et al.
One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation
by Fabian Paischer, Lukas Hauzenberger, Thomas Schmied, Benedikt Alkin, Marc Peter Deisenroth, Sepp Hochreiter
First submitted to arXiv on: 9 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | Foundation models (FMs) are pre-trained on large-scale datasets and then fine-tuned for specific applications. Low-rank adaptation (LoRA) is a popular fine-tuning method that updates pre-trained weights through newly introduced low-rank matrices, initialized either randomly or adaptively. Recent works have investigated initialization schemes and adaptive ranks during fine-tuning, but only in isolation, which leads to slow convergence or a uniform rank distribution and, in turn, suboptimal performance. The proposed method, EVA (Explained Variance Adaptation), initializes the LoRA matrices via singular value decomposition (SVD) of minibatches of activation vectors and redistributes ranks among the weight matrices so that the adapters capture as much information from the downstream data as possible. EVA converges faster than competing methods and attains higher average scores across a variety of fine-tuning tasks, while rank redistribution reduces the number of trainable parameters. (A minimal code sketch of this initialization appears below the table.) |
| Low | GrooveSquid.com (original content) | This research paper is about making foundation models better at specific tasks. These models are pre-trained on lots of data, but then need to be adjusted for what they are supposed to do. There are a few ways to adjust them today, but they don't always work well. The new method, called EVA, improves on them: it helps the model learn and adapt faster, so it gets better at its job more quickly. EVA has been tested on different tasks, such as language understanding and image classification, and it performs well. |
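To make the medium-difficulty description more concrete, below is a minimal, hypothetical PyTorch sketch of how an SVD-based LoRA initialization and explained-variance rank redistribution could look. It is not the authors' implementation: the function name `eva_init`, the layer shapes, the zero-initialized up-projection, and the budget-based redistribution rule are illustrative assumptions based on the summary above.

```python
# Minimal sketch of an EVA-style LoRA initialization (not the authors' code).
# Assumptions: a linear layer W of shape (d_out, d_in); "activations" are that
# layer's inputs collected over a few minibatches of downstream data; the LoRA
# update is B @ A with A of shape (r, d_in) and B of shape (d_out, r).

import torch

def eva_init(activations: torch.Tensor, d_out: int, rank: int):
    """Initialize LoRA factors from an SVD of minibatch activations.

    activations: (n_samples, d_in) inputs to the layer, gathered from
                 a few minibatches of the downstream data.
    Returns (A, B, explained_variance_per_component).
    """
    # Center the activations and take the SVD; the leading right-singular
    # vectors span the directions of maximum variance in the inputs.
    X = activations - activations.mean(dim=0, keepdim=True)
    _, S, Vh = torch.linalg.svd(X, full_matrices=False)

    # A (down-projection) gets the leading right-singular vectors, so the
    # adapter starts out aligned with the most informative input directions.
    A = Vh[:rank]                      # (rank, d_in)
    # B starts at zero so the adapted model equals the base model at step 0.
    B = torch.zeros(d_out, rank)

    explained_var = (S**2 / S.pow(2).sum())[:rank]
    return A, B, explained_var

# Toy usage: redistribute a fixed rank budget across layers by explained variance.
torch.manual_seed(0)
layers = {"layer1": (64, 32), "layer2": (64, 48)}  # (d_out, d_in), hypothetical
acts = {name: torch.randn(256, d_in) for name, (_, d_in) in layers.items()}

budget, per_layer_rank = 16, 8
stats = {n: eva_init(acts[n], layers[n][0], per_layer_rank) for n in layers}

# Rank every (layer, component) pair by explained variance and keep the top
# `budget` components, so layers with more informative activations receive
# a larger share of the total rank.
scored = sorted(
    ((float(v), n) for n, (_, _, ev) in stats.items() for v in ev),
    reverse=True,
)[:budget]
ranks = {n: sum(1 for _, m in scored if m == n) for n in layers}
print(ranks)  # per-layer ranks summing to the budget; data dependent
```

The key design idea the sketch illustrates is that the down-projection is data-driven (it points at high-variance activation directions) while the up-projection is zero, so training starts from the unmodified pre-trained model.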
Keywords
» Artificial intelligence » Fine-tuning » Image classification » Language understanding » LoRA » Low-rank adaptation