
Summary of SwapNet: Efficient Swapping for DNN Inference on Edge AI Devices Beyond the Memory Budget, by Kun Wang et al.


SwapNet: Efficient Swapping for DNN Inference on Edge AI Devices Beyond the Memory Budget

by Kun Wang, Jiani Cao, Zimu Zhou, Zhenjiang Li

First submitted to arXiv on: 30 Jan 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (original content by GrooveSquid.com)
This paper addresses the challenge of executing deep neural networks (DNNs) on edge artificial intelligence (AI) devices, whose memory budgets are limited. Existing remedies such as model compression or cloud offloading shrink the memory footprint but sacrifice accuracy or autonomy. The authors instead divide a DNN into blocks and swap them in and out of device memory, so that large models can execute within a small memory budget (a minimal sketch of this idea appears after this summary). They develop SwapNet, an efficient middleware for edge AI devices that eliminates unnecessary memory operations while maintaining compatibility with existing deep learning frameworks, GPU backends, and hardware architectures. A multi-DNN scheduling scheme built on SwapNet, evaluated on eleven inference tasks across three applications, achieves nearly the same latency as when sufficient memory is available.
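
To make the block-swapping idea concrete, here is a minimal sketch in PyTorch. It is not SwapNet's actual middleware: the function name, the pre-partitioned `blocks` list, and the plain `.to()` transfers are assumptions for illustration. The point is that peak device memory is bounded by the largest block rather than by the whole model:

```python
import torch
import torch.nn as nn

def swapped_inference(blocks: list[nn.Module], x: torch.Tensor,
                      device: str = "cuda") -> torch.Tensor:
    """Run a model one pre-partitioned block at a time, so peak
    device memory tracks the largest block, not the full model."""
    x = x.to(device)
    for block in blocks:
        block.to(device)              # swap this block's weights in
        with torch.no_grad():
            x = block(x)              # run this block's layers
        block.to("cpu")               # swap the weights back out
        torch.cuda.empty_cache()      # release cached allocations
    return x
```

One way to check the effect is to call torch.cuda.reset_peak_memory_stats() before and torch.cuda.max_memory_allocated() after the call: peak usage should track the largest block. Note that naive transfers like these still incur avoidable copies and allocations, which is the kind of unnecessary memory operation the paper's middleware is designed to eliminate.
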
Low Difficulty Summary (original content by GrooveSquid.com)
This paper is about running artificial intelligence (AI) on devices like smartphones or tablets. These devices have limited memory, which makes it hard to run complex AI models. Today, people work around this by making the model smaller or by sending the computation to a cloud server, but both solutions have drawbacks: they can make the AI less accurate or less autonomous. The authors propose a new idea: divide the AI model into smaller blocks and swap them in and out of memory as needed. They built software called SwapNet that makes this process efficient and compatible with existing technologies, so you can run larger AI models on your device without sacrificing performance.
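
The summaries mention a multi-DNN scheduling scheme but do not describe it, so the following is only a generic sketch of serving several models under one shared memory budget, reusing the hypothetical swapped_inference helper from above; the paper's actual policy may differ:

```python
from collections import deque

def serve_tasks(tasks: deque) -> list:
    """tasks holds (blocks, input_tensor) pairs, a hypothetical format.
    Each DNN runs with block swapping; a real scheduler might reorder
    tasks to cut swap traffic or meet deadlines."""
    results = []
    while tasks:
        blocks, x = tasks.popleft()   # FIFO order for simplicity
        results.append(swapped_inference(blocks, x))
    return results
```
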

Keywords

  • Artificial intelligence
  • Deep learning
  • Inference
  • Model compression