
AT-SNN: Adaptive Tokens for Vision Transformer on Spiking Neural Network

by Donghwa Kang, Youngmoon Lee, Eun-Kyu Lee, Brent Kang, Jinkyu Lee, Hyeongboo Baek

First submitted to arxiv on: 22 Aug 2024

Categories

  • Main: Artificial Intelligence (cs.AI)
  • Secondary: Computer Vision and Pattern Recognition (cs.CV)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (paper authors)
The high difficulty version is the paper's original abstract, available on arXiv.

Medium Difficulty Summary (GrooveSquid.com, original content)
This paper proposes a novel approach called AT-SNN that combines direct training with adaptive computation time (ACT) for spiking neural network (SNN)-based vision transformers (ViTs). The goal is to reduce power consumption while maintaining high accuracy. Building on existing methods, the authors adapt ACT, previously applied to recurrent neural networks (RNNs) and ViTs, to SNN-based ViTs, allowing less informative spatial tokens to be selectively discarded. Additionally, a token-merge mechanism based on token similarity is introduced, which further reduces the number of tokens while enhancing accuracy. AT-SNN is implemented on Spikformer and evaluated on image classification tasks, demonstrating improved energy efficiency and accuracy compared to state-of-the-art approaches. Specifically, it uses up to 42.4% fewer tokens than the existing best-performing method on CIFAR-100 while achieving higher accuracy.
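The paper itself does not publish pseudocode in this summary; purely as a rough illustration of the two mechanisms described above (ACT-style halting applied per spatial token, and merging of similar tokens), a minimal NumPy sketch under assumed shapes and thresholds might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def halt_tokens(cum_halt, layer_scores, threshold=0.9):
    """ACT-style halting per token (illustrative, not the paper's exact rule):
    accumulate a per-token halting score at each layer and mark a token
    inactive once its cumulative score crosses the threshold."""
    cum_halt = cum_halt + layer_scores
    active = cum_halt < threshold
    return cum_halt, active

def merge_most_similar(tokens):
    """Similarity-based token merge (illustrative): find the two most
    cosine-similar tokens and replace them with their mean, reducing
    the token count by one."""
    normed = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)           # ignore self-similarity
    i, j = np.unravel_index(np.argmax(sim), sim.shape)
    merged = (tokens[i] + tokens[j]) / 2
    keep = [k for k in range(len(tokens)) if k not in (i, j)]
    return np.vstack([tokens[keep], merged])

# Hypothetical setup: 8 spatial tokens with 16-dim embeddings.
tokens = rng.normal(size=(8, 16))
cum = np.zeros(8)
cum, active = halt_tokens(cum, rng.uniform(0.3, 0.6, size=8))
tokens = tokens[active]                      # discard halted tokens
tokens = merge_most_similar(tokens)          # then merge the closest pair
print(tokens.shape)
```

Here the halting scores, threshold, and token dimensions are all made-up values for illustration; in AT-SNN these quantities would be produced and trained within the Spikformer architecture.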
Low Difficulty Summary (GrooveSquid.com, original content)
This paper is about making computer vision models more efficient and accurate. It proposes a new way of processing information that combines two ideas: training the model directly and adjusting how much information it processes during use. This approach, called AT-SNN, is designed for a special type of neural network called a spiking neural network (SNN), applied to vision tasks such as image classification. The authors test their approach on several benchmark datasets, including CIFAR-10, CIFAR-100, and TinyImageNet, and show that it uses less energy while still achieving high accuracy.

Keywords

* Artificial intelligence
* Image classification
* Token