Summary of One Initialization to Rule Them All: Fine-tuning via Explained Variance Adaptation, by Fabian Paischer et al.
One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation
by Fabian Paischer, Lukas Hauzenberger, Thomas Schmied, Benedikt Alkin, Marc Peter Deisenroth, Sepp Hochreiter
First submitted to arXiv on: 9 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | Foundation models (FMs) are pre-trained on large-scale datasets and then fine-tuned for specific applications. Low-rank adaptation (LoRA) is a popular fine-tuning method that updates pre-trained weights through newly introduced low-rank matrices, initialized either randomly or adaptively. Recent works have investigated initialization schemes and adaptive ranks during fine-tuning, but only in isolation, which leads to slow convergence or a uniform rank distribution and, in turn, suboptimal performance. The proposed method, EVA (Explained Variance Adaptation), initializes the LoRA matrices via singular value decomposition (SVD) of minibatches of activation vectors and redistributes ranks among the weight matrices so that the adapters capture as much information from the downstream data as possible. EVA converges faster than competing methods and attains higher average scores across a variety of fine-tuning tasks, while rank redistribution reduces the number of trainable parameters. (A minimal code sketch of this initialization appears below the table.) |
| Low | GrooveSquid.com (original content) | This research paper is about making foundation models better at specific tasks. These models are pre-trained on lots of data, but then need to be adjusted for what they are supposed to do. There are a few ways to adjust them today, but they don't always work well. The new method, called EVA, improves on them: it helps the model learn and adapt faster, so it gets better at its job more quickly. EVA has been tested on different tasks, such as language understanding and image classification, and it performs well. |
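To make the medium-difficulty description more concrete, below is a minimal, hypothetical PyTorch sketch of how an SVD-based LoRA initialization and explained-variance rank redistribution could look. It is not the authors' implementation: the function name `eva_init`, the layer shapes, the zero-initialized up-projection, and the budget-based redistribution rule are illustrative assumptions based on the summary above.

```python
# Minimal sketch of an EVA-style LoRA initialization (not the authors' code).
# Assumptions: a linear layer W of shape (d_out, d_in); "activations" are that
# layer's inputs collected over a few minibatches of downstream data; the LoRA
# update is B @ A with A of shape (r, d_in) and B of shape (d_out, r).

import torch

def eva_init(activations: torch.Tensor, d_out: int, rank: int):
    """Initialize LoRA factors from an SVD of minibatch activations.

    activations: (n_samples, d_in) inputs to the layer, gathered from
                 a few minibatches of the downstream data.
    Returns (A, B, explained_variance_per_component).
    """
    # Center the activations and take the SVD; the leading right-singular
    # vectors span the directions of maximum variance in the inputs.
    X = activations - activations.mean(dim=0, keepdim=True)
    _, S, Vh = torch.linalg.svd(X, full_matrices=False)

    # A (down-projection) gets the leading right-singular vectors, so the
    # adapter starts out aligned with the most informative input directions.
    A = Vh[:rank]                      # (rank, d_in)
    # B starts at zero so the adapted model equals the base model at step 0.
    B = torch.zeros(d_out, rank)

    explained_var = (S**2 / S.pow(2).sum())[:rank]
    return A, B, explained_var

# Toy usage: redistribute a fixed rank budget across layers by explained variance.
torch.manual_seed(0)
layers = {"layer1": (64, 32), "layer2": (64, 48)}  # (d_out, d_in), hypothetical
acts = {name: torch.randn(256, d_in) for name, (_, d_in) in layers.items()}

budget, per_layer_rank = 16, 8
stats = {n: eva_init(acts[n], layers[n][0], per_layer_rank) for n in layers}

# Rank every (layer, component) pair by explained variance and keep the top
# `budget` components, so layers with more informative activations receive
# a larger share of the total rank.
scored = sorted(
    ((float(v), n) for n, (_, _, ev) in stats.items() for v in ev),
    reverse=True,
)[:budget]
ranks = {n: sum(1 for _, m in scored if m == n) for n in layers}
print(ranks)  # per-layer ranks summing to the budget; data dependent
```

The key design idea the sketch illustrates is that the down-projection is data-driven (it points at high-variance activation directions) while the up-projection is zero, so training starts from the unmodified pre-trained model.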
Keywords
» Artificial intelligence » Fine-tuning » Image classification » Language understanding » LoRA » Low-rank adaptation