
Combining Relevance and Magnitude for Resource-Aware DNN Pruning

by Carla Fabiana Chiasserini, Francesco Malandrino, Nuria Molner, Zhiqiang Zhao

First submitted to arXiv on: 21 May 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Neural and Evolutionary Computing (cs.NE)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes FlexRel, an approach that combines training-time and inference-time information to prune neural networks while retaining their accuracy. By leveraging both parameter magnitude and parameter relevance, FlexRel improves the trade-off between model accuracy and computational resources. The proposed method achieves higher pruning factors than existing approaches, yielding bandwidth savings of over 35% for typical accuracy targets.
Low Difficulty Summary (written by GrooveSquid.com, original content)
In this paper, researchers develop a new way to reduce neural networks’ latency by removing some parameters while keeping their accuracy. They call this approach FlexRel; it combines two types of information: how important each parameter is during training (its magnitude) and at the time the network makes predictions (its relevance). This helps improve the balance between model performance and resource usage. The results show that FlexRel can save a lot of bandwidth, over 35%, without sacrificing much accuracy.
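To make the idea concrete, here is a minimal sketch of magnitude-plus-relevance pruning. This is a hypothetical illustration of the general principle the summaries describe, not the authors' actual FlexRel algorithm: the mixing weight `alpha`, the normalization, and the relevance input are all assumptions for the example.

```python
import numpy as np

def combined_score_prune(weights, relevance, alpha=0.5, prune_frac=0.35):
    """Prune weights by a combined magnitude/relevance score.

    Hypothetical sketch: each parameter gets a score mixing its
    magnitude (training-time information) with a relevance estimate
    (inference-time information); the lowest-scoring fraction is
    zeroed out.
    """
    # Normalize both signals to [0, 1] so they are comparable.
    mag = np.abs(weights)
    mag = mag / (mag.max() + 1e-12)
    rel = relevance / (relevance.max() + 1e-12)

    # Combined score: alpha balances magnitude against relevance.
    score = alpha * mag + (1.0 - alpha) * rel

    # Zero out the prune_frac lowest-scoring parameters.
    k = int(prune_frac * weights.size)
    idx = np.argsort(score, axis=None)[:k]
    pruned = weights.copy()
    pruned.flat[idx] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.normal(size=100)
r = rng.uniform(size=100)  # stand-in for per-parameter relevance
pw = combined_score_prune(w, r, alpha=0.5, prune_frac=0.35)
print(np.count_nonzero(pw))  # 65 parameters survive
```

With `prune_frac=0.35`, 35 of the 100 parameters are zeroed, mirroring the paper's headline figure of over 35% savings; in practice the pruning factor would be chosen per layer to meet an accuracy target.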

Keywords

» Artificial intelligence  » Inference  » Pruning