Zeroth-Order Adaptive Neuron Alignment Based Pruning without Re-Training

by Elia Cunegatti, Leonardo Lucio Custode, Giovanni Iacca

First submitted to arXiv on: 11 Nov 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors): the paper’s original abstract.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The proposed algorithm, NeuroAL, is a top-up method that can be applied on top of any pruning algorithm for Large Language Models (LLMs) to obtain sparse models that maximize neuron alignment among activations. Unlike existing methods, NeuroAL adaptively selects the best hyperparameters for block-wise and row-wise sparsity ratios based on the model and the desired sparsity, and requires no re-training. The approach is evaluated across 276 cases combining four LLM families, three sparsity ratios, and ten language tasks, consistently outperforming state-of-the-art methods in the performance–runtime trade-off.
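To make the "neuron alignment" idea concrete, here is a minimal sketch (not the authors' implementation) of one plausible way to score it: compare each neuron's activations in the dense model against the same neuron in a pruned model via cosine similarity, averaged over neurons. The `prune_rowwise` helper and the ReLU layer are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def neuron_alignment(dense_acts, sparse_acts, eps=1e-8):
    """Mean per-neuron cosine similarity between dense and pruned
    activations. Both inputs are (num_inputs, num_neurons) arrays;
    higher values mean the pruned model's neurons stay better aligned."""
    num = (dense_acts * sparse_acts).sum(axis=0)
    denom = (np.linalg.norm(dense_acts, axis=0) *
             np.linalg.norm(sparse_acts, axis=0) + eps)
    return float((num / denom).mean())

def prune_rowwise(W, ratio):
    """Illustrative magnitude pruning: zero the smallest-magnitude
    weights in each row at the given sparsity ratio."""
    W = W.copy()
    k = int(W.shape[1] * ratio)
    for row in W:
        row[np.argsort(np.abs(row))[:k]] = 0.0
    return W

# Toy example: one ReLU layer, a few calibration inputs, and three
# candidate sparsity ratios scored by their alignment with the dense model.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 16))   # calibration inputs
W = rng.normal(size=(16, 32))   # dense layer weights

dense = np.maximum(X @ W, 0.0)  # dense ReLU activations
for ratio in (0.3, 0.5, 0.7):
    sparse = np.maximum(X @ prune_rowwise(W, ratio), 0.0)
    print(f"sparsity {ratio:.1f}: alignment {neuron_alignment(dense, sparse):.3f}")
```

In this sketch, a hyperparameter search like NeuroAL's would compare such alignment scores across candidate block-wise and row-wise sparsity splits and keep the best-aligned configuration, with no gradient updates or re-training involved.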

Low Difficulty Summary (written by GrooveSquid.com, original content)
NeuroAL is a new way to make large language models smaller and faster without losing their ability to understand and generate text. Normally, making these models smaller would require re-training them, but NeuroAL does it without needing to start over. This approach works well with different types of language models and tasks, and it’s better than current methods at balancing how well the model performs with how fast it is.

Keywords

» Artificial intelligence  » Alignment  » Pruning