
Summary of PAT: Pruning-Aware Tuning for Large Language Models, by Yijiang Liu et al.


PAT: Pruning-Aware Tuning for Large Language Models

by Yijiang Liu, Huanrui Yang, Youxin Chen, Rongyu Zhang, Miao Wang, Yuan Du, Li Du

First submitted to arXiv on: 27 Aug 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (written by GrooveSquid.com, original content)
Large language models excel at language tasks, especially after fine-tuning on top of pre-training. However, their memory and computational requirements hinder practical applications. To address this, the paper proposes the Pruning-Aware Tuning (PAT) paradigm, which incorporates structural pruning during fine-tuning to eliminate model redundancy while preserving performance. The approach uses Hybrid Sparsification Modules (HSMs) with trainable masks that unify channel sparsification while keeping the training overhead light. In addition, an Identity Loss is proposed to decouple the transformation and scaling properties of the HSMs for enhanced training robustness. Extensive experiments demonstrate that PAT excels in both performance and efficiency, achieving up to a 1.26% accuracy improvement at a similar training cost (an illustrative sketch of these components follows the summaries below).
Low Difficulty Summary (written by GrooveSquid.com, original content)
Large language models are very good at doing language tasks, especially when they’re trained on lots of data after being pre-trained. The problem is that these models need a lot of memory and computing power, which makes it hard to use them in real-world applications. To solve this, the paper comes up with a new way of training these models called Pruning-Aware Tuning (PAT). This approach helps get rid of extra parts of the model that aren’t necessary, while keeping the important parts that make it work well. The authors also propose some new ideas to help the model learn and adapt better.
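
Illustrative sketch (not from the paper)

To make the medium-difficulty summary more concrete, below is a minimal, hypothetical sketch of the general pattern it describes: a small trainable module initialized near identity, gated by a trainable per-channel mask that can be sparsified during fine-tuning, plus a regularizer that separates the module's transformation from its scaling. The module placement, mask parameterization, and exact loss form are assumptions for illustration only, not the authors' released implementation.

# Hypothetical sketch of the components named in the summary.
# Names and formulas are assumptions, not the paper's code.
import torch
import torch.nn as nn


class HybridSparsificationModule(nn.Module):
    """Assumed form of an HSM: a linear map initialized as identity,
    gated by a trainable per-channel mask."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        nn.init.eye_(self.proj.weight)                      # start as an identity mapping
        self.mask_logits = nn.Parameter(torch.zeros(dim))   # trainable channel mask

    def channel_mask(self) -> torch.Tensor:
        # Soft mask in (0, 1); channels driven toward 0 can later be pruned away.
        return torch.sigmoid(self.mask_logits)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x) * self.channel_mask()


def identity_loss(hsm: HybridSparsificationModule) -> torch.Tensor:
    """Assumed identity-style regularizer: penalize deviation of the
    transformation from identity separately from its overall scale."""
    w = hsm.proj.weight
    scale = w.diagonal().mean()                            # scaling component
    transform_dev = w / scale - torch.eye(w.shape[0])      # transformation component
    return transform_dev.pow(2).mean()


if __name__ == "__main__":
    hsm = HybridSparsificationModule(dim=16)
    x = torch.randn(4, 16)
    out = hsm(x)            # shape (4, 16); initially close to the input
    reg = identity_loss(hsm)
    print(out.shape, reg.item())

In this toy version the regularizer only constrains the module itself; in practice such a term would be added to the fine-tuning loss alongside whatever schedule drives the channel masks toward sparsity.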

Keywords

» Artificial intelligence  » Fine tuning  » Pruning