
Summary of Edge AI: Evaluation of Model Compression Techniques for Convolutional Neural Networks, by Samer Francy et al.


Edge AI: Evaluation of Model Compression Techniques for Convolutional Neural Networks

by Samer Francy, Raghubir Singh

First submitted to arXiv on: 2 Sep 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper explores compression techniques that reduce the size and computational complexity of ConvNeXt models on image classification tasks using the CIFAR-10 dataset. The authors evaluate structured pruning, unstructured pruning, and dynamic quantization on both cloud-based platforms and edge devices. Results show significant reductions in model size, with up to 75% achieved by structured pruning, while dynamic quantization reduces the number of parameters by up to 95%. Fine-tuning pre-trained models further improves compression performance, indicating benefits from combining pre-training with compression. The authors also investigate unstructured pruning methods, revealing trends in accuracy and compression but only limited reductions in computational complexity. Combining OTOV3 pruning with dynamic quantization improves compression further, yielding an 89.7% reduction in model size, a 95% reduction in both the number of parameters and MACs, and a 3.8% increase in accuracy. Deploying the final compressed model on an edge device demonstrates high accuracy (92.5%) and low inference time (20 ms), validating the effectiveness of these compression techniques for real-world edge computing applications.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about making AI models smaller and faster so they can run on devices like smartphones or smart home gadgets. The researchers tested different ways to shrink these models, called ConvNeXt, using a dataset of pictures. They found that some methods work better than others, especially when two techniques are combined. This helps the model run quickly and accurately on edge devices. The results show that the compressed models can be up to 89% smaller while still being very accurate (92.5%). This matters because it means these AI models can be used in real-world applications, like recognizing objects in images.
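To make the two core ideas in the summaries concrete, here is an illustrative NumPy sketch (not the paper's actual OTOV3/ConvNeXt pipeline) of unstructured magnitude pruning and symmetric int8 quantization applied to a weight matrix. The function names, the 75% sparsity target, and the per-tensor scale scheme are assumptions chosen to mirror the figures quoted above.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Unstructured pruning: zero out the smallest-magnitude weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)          # number of weights to remove
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask

def quantize_int8(weights):
    """Symmetric per-tensor quantization to int8 (the storage-saving step
    of dynamic quantization; activations would be quantized at runtime)."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

pruned = magnitude_prune(w, sparsity=0.75)   # ~75% of weights zeroed
q, scale = quantize_int8(pruned)             # int8 storage: 4x smaller than float32
dequant = q.astype(np.float32) * scale       # approximate reconstruction
```

The two steps compose naturally, which is why the paper combines pruning with quantization: pruning removes parameters outright, and quantization shrinks the bit-width of those that remain.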

Keywords

» Artificial intelligence  » Fine tuning  » Image classification  » Inference  » Pruning  » Quantization