Optimising TinyML with Quantization and Distillation of Transformer and Mamba Models for Indoor Localisation on Edge Devices

by Thanaphon Suwannaphong, Ferdian Jovan, Ian Craddock, Ryan McConville

First submitted to arXiv on: 12 Dec 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Software Engineering (cs.SE)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

Read the original abstract here

Medium Difficulty Summary (original content by GrooveSquid.com)
This research introduces small, efficient machine learning models (TinyML) for on-device indoor localisation on resource-constrained edge devices. The primary goal is to shift processing from centralised remote servers to the edge device itself, offering benefits such as longer battery life, enhanced privacy, reduced latency, and lower operational costs. To achieve this, model compression techniques such as quantization and knowledge distillation are employed to significantly reduce model size while maintaining high predictive performance. The study focuses on deploying a large state-of-the-art transformer-based model on low-power microcontrollers (MCUs) and proposes a state-space architecture based on Mamba as an alternative to the transformer. Experimental results demonstrate that the quantized transformer model performs well under a 64 KB RAM constraint, striking a balance between model size and localisation precision. Furthermore, the compact Mamba model performs strongly with only 32 KB of RAM, without the need for model compression.
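
To make the two compression techniques concrete, below is a minimal PyTorch sketch of a standard knowledge-distillation loss and post-training dynamic quantization. The tiny teacher and student networks, layer sizes, temperature, and blending weight are illustrative assumptions, not the paper's actual transformer or Mamba models.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative stand-ins only: the paper's actual teacher (transformer)
# and student networks are not reproduced here.
teacher = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 32))
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soften both output distributions with temperature T, then blend the
    # teacher-matching KL term with the ordinary hard-label cross-entropy.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Post-training dynamic quantization: Linear-layer weights are stored as
# int8, shrinking the model towards MCU-scale memory budgets.
quantized_student = torch.ao.quantization.quantize_dynamic(
    student, {nn.Linear}, dtype=torch.qint8
)

On a real microcontroller the compressed model would typically be exported to a C-friendly runtime (for example TensorFlow Lite for Microcontrollers) rather than executed through PyTorch, but the size-versus-accuracy trade-off sketched here is the one the paper tunes against its 64 KB and 32 KB RAM budgets.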

Low Difficulty Summary (original content by GrooveSquid.com)
TinyML models are tiny versions of big machine learning models that can run on devices like smartwatches or fitness trackers. These devices don't have a lot of power or storage, so we need to make the models smaller and more efficient. This paper shows how to do this using special techniques called quantization and knowledge distillation. The goal is to get these tiny models working well even with very limited resources, like 64 KB of RAM. The paper shows this can work, which could be really useful for things like tracking patient movement in hospitals.

Keywords

» Artificial intelligence  » Knowledge distillation  » Machine learning  » Model compression  » Precision  » Quantization  » Tracking  » Transformer