Summary of Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs, by Ashwinee Panda et al.
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
by Ashwinee Panda, Berivan Isik, Xiangyu Qi, Sanmi Koyejo, Tsachy Weissman, Prateek Mittal
First submitted to arXiv on: 24 Jun 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | Lottery Ticket Adaptation (LoTA) addresses the challenge of adapting large language models to multiple tasks without destructive interference. Existing adaptation methods modify all model weights, which causes interference between tasks and catastrophic forgetting of earlier ones. LoTA instead identifies a sparse subnetwork of the model and optimizes only those weights for each task, outperforming full fine-tuning and low-rank adaptation (LoRA). Because each task is captured by a sparse task vector (a "lottery ticket"), LoTA also enables model merging across highly dissimilar tasks. Evaluations on challenging tasks, including instruction following, reasoning, math, and summarization, demonstrate its effectiveness. (A rough code sketch of the sparse-mask idea follows this table.) |
Low | GrooveSquid.com (original content) | LoTA is a new way to teach language models different tasks without losing what they already know. Adapting these models is currently tricky because it changes all of the model's settings, which causes problems when learning new things. LoTA solves this by finding and updating only the parts of the model that matter for each task. This lets a single model learn many different skills without forgetting what it already knows. The results show that LoTA works better than other methods and helps merge information from very different tasks. |
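To make the sparse-subnetwork idea more concrete, below is a minimal, illustrative sketch in PyTorch. It is not the authors' implementation: the toy `nn.Linear` model, the `top_k_mask` and `masked_finetune` helpers, the 90% sparsity level, and the MSE objective are all assumptions chosen for brevity. The sketch builds a mask from the largest-magnitude entries of a task vector (fine-tuned weights minus base weights), rewinds to the base weights, and then fine-tunes only the masked parameters.

```python
# Minimal sketch of the sparse-mask idea described above; NOT the paper's code.
# Assumptions: a toy linear model stands in for an LLM, and the mask keeps the
# largest-magnitude entries of a task vector (fine-tuned minus base weights).
import torch
import torch.nn as nn

def top_k_mask(task_vector: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Keep the largest-magnitude fraction (1 - sparsity) of entries."""
    k = max(1, int(task_vector.numel() * (1.0 - sparsity)))
    threshold = torch.topk(task_vector.abs().flatten(), k).values.min()
    return (task_vector.abs() >= threshold).float()

def masked_finetune(model: nn.Module, masks: dict, data, lr: float = 1e-2, steps: int = 100):
    """Fine-tune only the parameters selected by the per-tensor masks."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    x, y = data
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        with torch.no_grad():
            for name, p in model.named_parameters():
                if p.grad is not None:
                    p.grad *= masks[name]  # zero gradients outside the subnetwork
        opt.step()

# Toy usage: dense fine-tune once to get a task vector, extract a mask,
# rewind to the base weights, then optimize only the "lottery ticket".
base = nn.Linear(4, 1)
tuned = nn.Linear(4, 1)
tuned.load_state_dict(base.state_dict())
x, y = torch.randn(32, 4), torch.randn(32, 1)
dense_masks = {n: torch.ones_like(p) for n, p in tuned.named_parameters()}
masked_finetune(tuned, dense_masks, (x, y))          # dense first pass

base_sd, tuned_sd = base.state_dict(), tuned.state_dict()
masks = {n: top_k_mask(tuned_sd[n] - base_sd[n], sparsity=0.9) for n in base_sd}

sparse_model = nn.Linear(4, 1)
sparse_model.load_state_dict(base.state_dict())       # rewind to base weights
masked_finetune(sparse_model, masks, (x, y))          # sparse second pass
```

Masking gradients rather than weights keeps the unselected parameters exactly at their base values, so each task's update stays confined to a sparse task vector, which is the property the summary above credits for reduced interference when merging models.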
Keywords
» Artificial intelligence » Fine-tuning » LoRA » Low-rank adaptation » Summarization