DNN Memory Footprint Reduction via Post-Training Intra-Layer Multi-Precision Quantization

by Behnam Ghavami, Amin Kamjoo, Lesley Shannon, Steve Wilton

First submitted to arXiv on: 3 Apr 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The paper's original abstract, available on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes a technique for reducing the memory footprint of Deep Neural Networks (DNNs), enabling accurate deployment on resource-constrained edge devices. The method, Post-Training Intra-Layer Multi-Precision Quantization (PTILMPQ), uses a post-training quantization approach that eliminates the need for extensive training data. By estimating layer and channel importance within the network, PTILMPQ allocates bit precision non-uniformly throughout the quantization process, yielding significant memory savings without sacrificing model accuracy. For instance, a quantized ResNet50 reaches 74.57% accuracy with a memory footprint of 9.5 MB, a 25.49% reduction compared to previous methods, at the cost of only a minor 1.08% decrease in accuracy.
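The core idea — estimating per-channel importance and assigning different bit widths to channels within the same layer — can be sketched in a few lines. Note that this is a minimal illustration, not the paper's implementation: PTILMPQ's actual importance metric and bit-allocation procedure are defined in the paper, while this sketch substitutes a channel L2-norm proxy for importance and a fixed high-precision fraction (`frac_high`), both of which are assumptions for demonstration purposes.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric uniform quantization of a weight array to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    m = np.max(np.abs(w))
    scale = m / qmax if m > 0 else 1.0
    # Round to the integer grid, clip to the representable range, dequantize.
    return np.round(w / scale).clip(-qmax, qmax) * scale

def ptq_intra_layer(layer_weights, high_bits=8, low_bits=4, frac_high=0.25):
    """Illustrative intra-layer multi-precision post-training quantization.

    Ranks output channels (axis 0) by an importance proxy (L2 norm of the
    channel's weights) and quantizes the most important fraction at
    `high_bits`, the rest at `low_bits`. No training data is required.
    """
    n_channels = layer_weights.shape[0]
    importance = np.linalg.norm(
        layer_weights.reshape(n_channels, -1), axis=1)
    n_high = max(1, int(frac_high * n_channels))
    high_idx = set(np.argsort(importance)[-n_high:].tolist())

    out = np.empty_like(layer_weights)
    for c in range(n_channels):
        bits = high_bits if c in high_idx else low_bits
        out[c] = quantize_uniform(layer_weights[c], bits)
    return out
```

Because most channels are stored at the lower bit width, the layer's memory footprint approaches the low-precision budget, while the few high-importance channels retain enough precision to limit the accuracy drop — the trade-off the paper's ResNet50 result quantifies.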
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper introduces a new way to reduce the size of Deep Neural Networks (DNNs) so they can run on devices with limited memory. This is important because many people are concerned about privacy and want their data to be processed closer to where it was collected. The new method, called Post-Training Intra-Layer Multi-Precision Quantization (PTILMPQ), helps keep the same level of accuracy as before but uses less memory.

Keywords

* Artificial intelligence  * Precision  * Quantization