


Towards Modular LLMs by Building and Reusing a Library of LoRAs

by Oleksiy Ostapenko, Zhan Su, Edoardo Maria Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni

First submitted to arXiv on: 18 May 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (the paper's original abstract, written by the paper authors)
Read the original abstract on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com; original content)
This paper investigates whether pre-trained low-rank adapters (LoRAs) for large language models (LLMs) can be reused to improve performance on new tasks without retraining. To achieve this, the authors build a library of adapters and devise techniques for zero-shot and supervised task generalization. They benchmark existing methods and introduce model-based clustering (MBC), which groups tasks by the similarity of their adapter parameters, indirectly optimizing transfer across the multi-task dataset. They also present Arrow, a novel zero-shot routing mechanism that selects relevant adapters for new inputs without retraining. Experiments with LLMs such as Phi-2 and Mistral show superior generalization to new tasks with MBC-based adapters and Arrow routing. This work takes steps towards creating modular, adaptable LLMs that can match or outperform traditional joint training.
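
To make the two methods concrete, below is a minimal, illustrative sketch (not the authors' implementation) of both ideas: mbc_cluster groups tasks by the cosine similarity of their flattened LoRA weights, and arrow_prototype/arrow_route approximate Arrow-style routing by scoring each adapter against the top right singular vector of its low-rank update B·A. The function names, array shapes, and the use of NumPy and scikit-learn's k-means are assumptions for illustration.

    # Illustrative sketch only; not the paper's released code.
    # Assumes one LoRA (A, B) per task, with A of shape (r, d_in)
    # and B of shape (d_out, r), so the weight update is B @ A.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import normalize

    def mbc_cluster(task_loras, n_clusters):
        # Model-based clustering: flatten each task's adapter into one
        # vector, l2-normalize the rows so k-means on dot products
        # approximates cosine similarity, then cluster; tasks sharing
        # a cluster can be retrained jointly into one adapter.
        X = np.stack([np.concatenate([A.ravel(), B.ravel()])
                      for A, B in task_loras])
        return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(normalize(X))

    def arrow_prototype(A, B):
        # Prototype for one adapter: the top right singular vector of
        # the update B @ A, i.e. the input direction the adapter
        # amplifies the most.
        _, _, Vt = np.linalg.svd(B @ A, full_matrices=False)
        return Vt[0]                              # shape (d_in,)

    def arrow_route(h, prototypes, top_k=2):
        # Zero-shot routing: score each adapter by |<h, prototype>|
        # (singular vectors have arbitrary sign), keep the top-k, and
        # softmax their scores into mixing weights.
        scores = np.abs(prototypes @ h)           # prototypes: (n_adapters, d_in)
        top = np.argsort(-scores)[:top_k]
        w = np.exp(scores[top] - scores[top].max())
        return top, w / w.sum()

In the paper, Arrow routes per token and per layer; a single hidden-state vector h stands in here for brevity.
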
Low Difficulty Summary (written by GrooveSquid.com; original content)
Imagine having a super smart AI model that can help with many different tasks without needing to be retrained each time. That’s the idea behind this paper! Researchers are trying to figure out how to reuse special “adapters” they’ve already trained for one task, so they can use them to improve performance on new tasks too. They developed a way to group similar tasks together and created a system that can pick the best adapter without needing retraining. This could lead to making AI models more flexible and helpful in lots of situations.

Keywords

» Artificial intelligence  » Clustering  » Generalization  » Multi-task  » Supervised  » Zero-shot