
Summary of Realizing Unaligned Block-wise Pruning for DNN Acceleration on Mobile Devices, by Hayun Lee et al.


Realizing Unaligned Block-wise Pruning for DNN Acceleration on Mobile Devices

by Hayun Lee, Dongkun Shin

First submitted to arXiv on: 29 Jul 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This research paper proposes novel techniques to run computationally intensive deep neural networks (DNNs) directly on mobile devices, addressing the challenges posed by limited computing and memory resources. Specifically, it introduces unaligned block pruning (UBP), which allows for more effective model compression while minimizing the accuracy drop. The authors develop Block Expansion and Division (BED), a fast, pseudo-optimal algorithm for selecting which blocks to keep under UBP, together with efficient inference kernels for mobile devices. Experimental results on real mobile phones using MobileNet and ResNet models demonstrate the superiority of their techniques.
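To make the idea concrete, below is a minimal, illustrative sketch contrasting aligned and unaligned block-wise pruning on a toy weight matrix. The block size, the magnitude-based scoring, and the greedy selection loop are assumptions chosen for clarity; this is not the paper's BED algorithm or its mobile inference kernels.

```python
# Illustrative sketch only: contrast aligned vs. unaligned block-wise pruning
# on a toy weight matrix. Block size, scoring, and greedy selection are
# simplified assumptions, NOT the paper's BED algorithm.

import numpy as np

def prune_aligned_blocks(w, block, keep_ratio):
    """Keep the highest-magnitude blocks on a fixed, grid-aligned tiling."""
    rows, cols = w.shape
    br, bc = block
    scores = []
    for r in range(0, rows, br):
        for c in range(0, cols, bc):
            tile = w[r:r+br, c:c+bc]
            scores.append((np.abs(tile).sum(), r, c))
    scores.sort(reverse=True)
    n_keep = int(len(scores) * keep_ratio)
    mask = np.zeros_like(w)
    for _, r, c in scores[:n_keep]:
        mask[r:r+br, c:c+bc] = 1.0
    return w * mask

def prune_unaligned_blocks(w, block, n_blocks):
    """Greedily keep blocks that may start at any row/column offset."""
    rows, cols = w.shape
    br, bc = block
    remaining = np.abs(w).copy()
    mask = np.zeros_like(w)
    for _ in range(n_blocks):
        best, best_rc = -1.0, (0, 0)
        for r in range(rows - br + 1):          # any starting row
            for c in range(cols - bc + 1):      # any starting column
                s = remaining[r:r+br, c:c+bc].sum()
                if s > best:
                    best, best_rc = s, (r, c)
        r, c = best_rc
        mask[r:r+br, c:c+bc] = 1.0
        remaining[r:r+br, c:c+bc] = 0.0         # don't re-select the same weights
    return w * mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(8, 8))
    kept_aligned = prune_aligned_blocks(w, block=(2, 2), keep_ratio=0.25)
    kept_unaligned = prune_unaligned_blocks(w, block=(2, 2), n_blocks=4)
    print("aligned   kept magnitude:", np.abs(kept_aligned).sum())
    print("unaligned kept magnitude:", np.abs(kept_unaligned).sum())
```

The contrast illustrates the intuition behind UBP: because unaligned blocks can start at any row and column offset, the pruner has more freedom to cover the largest weights with the same number of blocks, which is why the paper reports a smaller accuracy drop at the same compression level.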
Low Difficulty Summary (written by GrooveSquid.com, original content)
This research makes it possible to use artificial intelligence (AI) directly on your phone. This is important because some AI tasks are very complex and need a lot of computer power, which can be a problem for smaller devices like phones. The researchers found a way to make the AI more efficient by “pruning” or removing parts that aren’t as important. They also developed a new way to choose the right parts to keep, called Block Expansion and Division (BED). This makes it possible to run these complex AI models on phones without sacrificing too much accuracy.

Keywords

» Artificial intelligence  » Inference  » Model compression  » Pruning  » ResNet