A QoE-Aware Split Inference Accelerating Algorithm for NOMA-based Edge Intelligence

by Xin Yuan, Ning Li, Quan Chen, Wenchao Xu, Zhaoxin Zhang, Song Guo

First submitted to arXiv on: 25 Sep 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

The high difficulty version is the paper's original abstract, available on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)

The proposed model split inference approach improves edge intelligence by dividing an AI model into sub-models and offloading the resource-intensive ones to an edge server, reducing latency and on-device resource requirements. However, existing works focus on quality of service (QoS) while neglecting quality of experience (QoE), a critical aspect for users. To address this gap, the authors propose an effective resource allocation algorithm (ERA) that balances inference delay, QoE, and resource consumption to find the optimal model split point and resource allocation strategy. ERA uses gradient descent (GD)-based optimization to find the tradeoff among these factors, and it tackles the complexity of that optimization through a loop-iteration GD approach. Experimental results demonstrate significant performance improvements over previous studies.
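
The summary does not spell out the paper's actual optimization, so below is a minimal Python sketch of the loop-iteration GD idea it describes: an outer loop enumerates discrete candidate split points, and an inner numerical gradient descent tunes a continuous resource variable (here, a transmit-power fraction) for each candidate. The toy objective, all constants, and the names objective and solve_era_like are illustrative assumptions, not the authors' formulation.

    import numpy as np

    def objective(power, split, w_delay=1.0, w_qoe=1.0, w_res=0.1):
        # Toy weighted cost combining inference delay, a QoE penalty, and
        # resource use, for a hypothetical 10-layer model. `power` is the
        # device transmit-power fraction; `split` is the cut layer index.
        device_time = split * 1.0            # device runs layers [0, split)
        server_time = (10 - split) * 0.2     # server assumed 5x faster
        rate = np.log2(1.0 + 50.0 * power)   # toy uplink rate term
        tx_delay = 2.0 / max(rate, 1e-9)     # uploading intermediate features
        delay = device_time + tx_delay + server_time
        qoe_penalty = max(0.0, delay - 3.0) ** 2  # QoE degrades past a deadline
        return w_delay * delay + w_qoe * qoe_penalty + w_res * power

    def solve_era_like(num_layers=10, steps=200, lr=0.01, eps=1e-4):
        # Outer loop: enumerate discrete split points. Inner loop: numerical
        # gradient descent (central difference) on the power fraction.
        best = None
        for split in range(1, num_layers):
            p = 0.5
            for _ in range(steps):
                grad = (objective(p + eps, split) - objective(p - eps, split)) / (2 * eps)
                p = float(np.clip(p - lr * grad, 1e-3, 1.0))
            cost = objective(p, split)
            if best is None or cost < best[0]:
                best = (cost, split, p)
        return best

    cost, split, power = solve_era_like()
    print(f"best split at layer {split}, power fraction {power:.3f}, cost {cost:.3f}")

A real instantiation would replace the toy delay and rate terms with the paper's NOMA uplink model and its QoE function, but the two-level structure (discrete search outside, gradient descent inside) is the point of the sketch.
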
Low Difficulty Summary (written by GrooveSquid.com, original content)

AI models are too big for edge devices! To fix this, researchers suggest splitting AI models into smaller pieces and sending the heavy ones to a server wirelessly. But existing solutions only care about how fast the work gets done (QoS) and ignore how good the experience feels to users (QoE). This paper proposes an algorithm that balances QoE with speed and resource usage to make edge intelligence better. It uses special math tricks to find the right balance and shows promising results in tests.

Keywords

» Artificial intelligence  » Gradient descent  » Inference  » Optimization