
Summary of MoreauPruner: Robust Pruning of Large Language Models against Weight Perturbations, by Zixiao Wang et al.


MoreauPruner: Robust Pruning of Large Language Models against Weight Perturbations

by Zixiao Wang, Jingwei Zhang, Wenqian Zhao, Farzan Farnia, Bei Yu

First submitted to arXiv on: 11 Jun 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (written by GrooveSquid.com; original content)
MoreauPruner is a novel few-shot gradient-based pruning method that addresses the instability of existing pruning algorithms under weight perturbations. Current methods treat model weights as static values and neglect the effect of small weight perturbations; as a result, one-shot gradient pruning of LLMs with billions of parameters can produce unstable results under minor errors, such as switching the data format between bfloat16 and float16. Drawing on optimization analysis, MoreauPruner estimates weight importance from the neural network's Moreau envelope and combines this estimate with ℓ1-norm regularization to induce sparsity. The algorithm is evaluated on several well-known LLMs, including LLaMA-7B, LLaMA-13B, LLaMA3-8B, and Vicuna-7B. Numerical results demonstrate MoreauPruner's robustness against weight perturbations and its competitive accuracy-based scores compared with existing pruning methods.
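To make the idea concrete, below is a minimal sketch (not the authors' implementation) of how a Moreau-envelope-based importance score could be computed in PyTorch. It assumes a scalar-valued `loss_fn`; the function name `moreau_importance` and the values of `lam`, `steps`, and `lr` are illustrative choices, not taken from the paper. The Moreau envelope env(w) = min_z loss_fn(z) + (lam/2)·||z − w||² is minimized approximately by a few inner gradient steps, and its gradient lam·(w − z*) replaces the raw loss gradient in a Taylor-style importance score:

```python
import torch

def moreau_importance(w, loss_fn, lam=1e-3, steps=5, lr=1e-2):
    """Sketch: perturbation-smoothed weight importance via the Moreau envelope.

    env(w) = min_z loss_fn(z) + (lam / 2) * ||z - w||^2
    grad(env)(w) = lam * (w - z*), where z* is the proximal point.
    Importance is scored as |w * grad(env)(w)|, a smoothed Taylor score.
    All hyperparameter values here are illustrative assumptions.
    """
    z = w.detach().clone().requires_grad_(True)
    opt = torch.optim.SGD([z], lr=lr)
    for _ in range(steps):  # inner minimization for the proximal point z*
        opt.zero_grad()
        obj = loss_fn(z) + 0.5 * lam * (z - w.detach()).pow(2).sum()
        obj.backward()
        opt.step()
    env_grad = lam * (w.detach() - z.detach())  # gradient of the envelope
    return (w.detach() * env_grad).abs()        # per-weight importance score
```

A pruning pass would then zero out the weights with the smallest scores; the paper's ℓ1-regularized variant would additionally add a shrinkage term to the inner objective, which is omitted here for brevity.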
Low Difficulty Summary (written by GrooveSquid.com; original content)
Large language models have billions of parameters, which can make them fragile: tiny changes to the weights can change the model's behavior. This is a problem because current pruning methods don't account for such changes when deciding which weights to remove. A new method called MoreauPruner addresses this by looking at a smoothed "envelope" of the neural network's loss to figure out which parts of the model are most important, and then uses that information to prune the model in a way that stays stable under small weight perturbations.
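As a small, self-contained illustration of the fragility both summaries describe (not code from the paper), casting the same weights to bfloat16 versus float16 rounds them differently, and the resulting tiny perturbation is the kind of "minor error" that can reorder near-tied gradient-based importance scores in one-shot pruning:

```python
import torch

w = torch.randn(5, dtype=torch.float32)
w_bf16 = w.to(torch.bfloat16).to(torch.float32)  # round-trip through bfloat16
w_fp16 = w.to(torch.float16).to(torch.float32)   # round-trip through float16

# The two formats allocate mantissa bits differently, so the recovered
# weights disagree by a small perturbation.
print((w_bf16 - w_fp16).abs().max())
```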

Keywords

» Artificial intelligence  » Few shot  » Llama  » Neural network  » One shot  » Optimization  » Pruning  » Regularization