DNN Memory Footprint Reduction via Post-Training Intra-Layer Multi-Precision Quantization

by Behnam Ghavami, Amin Kamjoo, Lesley Shannon, Steve Wilton

First submitted to arXiv on: 3 Apr 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The paper's original abstract, available on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes a technique for reducing the memory footprint of Deep Neural Networks (DNNs), enabling accurate deployment on resource-constrained edge devices. The method, Post-Training Intra-Layer Multi-Precision Quantization (PTILMPQ), uses a post-training quantization approach that eliminates the need for extensive training data. By estimating layer and channel importance within the network, PTILMPQ allocates bit precision non-uniformly throughout the quantization process, yielding significant memory savings without sacrificing model accuracy. For instance, a quantized ResNet50 reaches 74.57% accuracy with a memory footprint of 9.5 MB, a 25.49% reduction compared to previous methods, at the cost of only a minor 1.08% decrease in accuracy.
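The core idea — estimating per-channel importance and assigning different bit widths to channels within the same layer — can be sketched in a few lines. Note that this is a minimal illustration, not the paper's implementation: PTILMPQ's actual importance metric and bit-allocation procedure are defined in the paper, while this sketch substitutes a channel L2-norm proxy for importance and a fixed high-precision fraction (`frac_high`), both of which are assumptions for demonstration purposes.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric uniform quantization of a weight array to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    m = np.max(np.abs(w))
    scale = m / qmax if m > 0 else 1.0
    # Round to the integer grid, clip to the representable range, dequantize.
    return np.round(w / scale).clip(-qmax, qmax) * scale

def ptq_intra_layer(layer_weights, high_bits=8, low_bits=4, frac_high=0.25):
    """Illustrative intra-layer multi-precision post-training quantization.

    Ranks output channels (axis 0) by an importance proxy (L2 norm of the
    channel's weights) and quantizes the most important fraction at
    `high_bits`, the rest at `low_bits`. No training data is required.
    """
    n_channels = layer_weights.shape[0]
    importance = np.linalg.norm(
        layer_weights.reshape(n_channels, -1), axis=1)
    n_high = max(1, int(frac_high * n_channels))
    high_idx = set(np.argsort(importance)[-n_high:].tolist())

    out = np.empty_like(layer_weights)
    for c in range(n_channels):
        bits = high_bits if c in high_idx else low_bits
        out[c] = quantize_uniform(layer_weights[c], bits)
    return out
```

Because most channels are stored at the lower bit width, the layer's memory footprint approaches the low-precision budget, while the few high-importance channels retain enough precision to limit the accuracy drop — the trade-off the paper's ResNet50 result quantifies.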
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper introduces a new way to reduce the size of Deep Neural Networks (DNNs) so they can run on devices with limited memory. This is important because many people are concerned about privacy and want their data to be processed closer to where it was collected. The new method, called Post-Training Intra-Layer Multi-Precision Quantization (PTILMPQ), helps keep the same level of accuracy as before but uses less memory.

Keywords

* Artificial intelligence  * Precision  * Quantization