Summary of SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery, by Enneng Yang et al.
SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery
by Enneng Yang, Li Shen, Zhenyi Wang, Guibing Guo, Xingwei Wang, Xiaochun Cao, Jie Zhang, Dacheng Tao
First submitted to arXiv on: 18 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary Model merging-based multi-task learning (MTL) combines multiple expert models into a single model without requiring access to their raw training data, making it a promising way to perform MTL. However, the merged model suffers from “representation bias”: a significant gap between its representations and those of the individual expert models, which leads to suboptimal performance. To address this, the authors introduce Surgery, a lightweight, task-specific module that aligns the merged model’s final-layer representations with those of the expert models. Although Surgery reduces the bias, a performance gap to traditional MTL methods remains, and further analysis shows that representation bias exists at every layer. The authors therefore propose deep representation surgery (SurgeryV2), which mitigates the bias across all layers, and design an unsupervised optimization objective to train both the Surgery and SurgeryV2 modules. Experimental results show that plugging these modules into state-of-the-art model merging schemes yields significant performance gains (a minimal code sketch of this idea follows the table). |
Low | GrooveSquid.com (original content) | Low Difficulty Summary Model merging-based multi-task learning is a promising way to perform MTL without access to raw training data. However, the merged model suffers from “representation bias”: a gap between its representations and those of the expert models that hurts performance. The authors propose lightweight modules, Surgery and deep representation surgery (SurgeryV2), that align the merged model’s representations with the experts’ to reduce this bias. The modules are trained with an unsupervised objective, and experiments show significant performance gains. |
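
To make the mechanism concrete, here is a minimal PyTorch-style sketch of the idea summarized above. All names (`SurgeryAdapter`, `surgery_loss`, `surgeryv2_loss`, `merged_encoder`, `expert_encoder`) are hypothetical illustrations rather than the authors’ released code, and the L1 alignment loss is one reasonable choice of distance, not necessarily the paper’s exact objective.

```python
# Sketch only: hypothetical names, not the authors' implementation.
import torch
import torch.nn as nn


class SurgeryAdapter(nn.Module):
    """Lightweight task-specific module that nudges the merged model's
    representation toward the corresponding expert's representation."""
    def __init__(self, dim: int, rank: int = 16):
        super().__init__()
        # Low-rank residual correction keeps the module lightweight.
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)
        nn.init.zeros_(self.up.weight)  # start as an identity mapping

    def forward(self, z_merged: torch.Tensor) -> torch.Tensor:
        return z_merged + self.up(self.down(z_merged))


def surgery_loss(adapter, merged_encoder, expert_encoder, x):
    """Surgery (final layer only): align the adapted representation of the
    merged model with the frozen expert's representation on unlabeled inputs x."""
    with torch.no_grad():
        z_expert = expert_encoder(x)   # target representation, no labels needed
        z_merged = merged_encoder(x)   # biased representation from the merged model
    return nn.functional.l1_loss(adapter(z_merged), z_expert)


def surgeryv2_loss(adapters, merged_layers, expert_layers, x):
    """SurgeryV2 (sketch): attach one adapter per layer, feed the corrected
    feature forward, and sum the per-layer alignment losses."""
    loss, z_m, z_e = 0.0, x, x
    for adapter, m_layer, e_layer in zip(adapters, merged_layers, expert_layers):
        with torch.no_grad():
            z_e = e_layer(z_e)         # expert's layer-wise target
        z_m = adapter(m_layer(z_m))    # corrected feature is passed to the next layer
        loss = loss + nn.functional.l1_loss(z_m, z_e)
    return loss
```

In this sketch the merged and expert backbones stay frozen; only the small adapters are optimized, using unlabeled inputs, which matches the unsupervised objective the summaries describe.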
Keywords
» Artificial intelligence » Optimization » Unsupervised