


Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic

by Yifei He, Yuzheng Hu, Yong Lin, Tong Zhang, Han Zhao

First submitted to arXiv on: 24 Aug 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (original content by GrooveSquid.com)
The paper introduces Localize-and-Stitch, a novel approach to model merging that preserves the strengths of multiple finetuned models. Traditional methods merge models globally, leading to task interference and degraded performance. The proposed algorithm works in two steps: localization, which identifies tiny regions of each finetuned model containing the skills essential to its downstream task, and stitching, which reintegrates only these regions into the pretrained model to achieve task synergy. The approach is evaluated on a range of vision and language benchmarks, where it outperforms existing methods under different data availability scenarios. Localize-and-Stitch also facilitates model compression and preserves pretrained knowledge, enabling flexible and continual skill composition from multiple finetuned models with minimal storage and computational overhead.
Low Difficulty Summary (original content by GrooveSquid.com)
The paper introduces a new way to combine the strengths of several specialized models into a single model that works well across their tasks. Right now, models are usually combined by adding or subtracting numbers across all of their parameters, which can sometimes make the combined model worse than the originals. The new approach instead finds the tiny parts of each model that matter for its task (called “localization”) and then puts only those parts back into the shared base model (called “stitching”), letting the combined model perform well on every task. The approach is tested on many different kinds of problems, like recognizing objects and understanding language, and it performs well compared to other approaches.

Keywords

  • Artificial intelligence
  • Model compression