


Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic

by Yifei He, Yuzheng Hu, Yong Lin, Tong Zhang, Han Zhao

First submitted to arXiv on: 24 Aug 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (original content by GrooveSquid.com)
The paper introduces Localize-and-Stitch, a novel approach to model merging that preserves the strengths of multiple finetuned models. Traditional methods merge models globally, leading to task interference and degraded performance. The proposed algorithm works in two steps: localization, which identifies tiny regions of each finetuned model containing the skills essential to its downstream task, and stitching, which reintegrates only these regions into the pretrained model to achieve task synergy. The approach is evaluated on a range of vision and language benchmarks, where it outperforms existing methods under different data availability scenarios. Localize-and-Stitch also facilitates model compression and preserves pretrained knowledge, enabling flexible and continual skill composition from multiple finetuned models with minimal storage and computational overhead.
Low Difficulty Summary (original content by GrooveSquid.com)
The paper introduces a new way to combine the strengths of several specialized models into a single model that works well across their tasks. Right now, models are usually combined by adding or subtracting numbers across all of their parameters, which can sometimes make the combined model worse than the originals. The new approach instead finds the tiny parts of each model that matter for its task (called “localization”) and then puts only those parts back into the shared base model (called “stitching”), letting the combined model perform well on every task. The approach is tested on many different kinds of problems, like recognizing objects and understanding language, and it performs well compared to other approaches.

Keywords

  • Artificial intelligence
  • Model compression