
Summary of Towards Gradient-based Time-series Explanations Through a Spatiotemporal Attention Network, by Min Hun Lee


Towards Gradient-based Time-Series Explanations through a SpatioTemporal Attention Network

by Min Hun Lee

First submitted to arXiv on: 18 May 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper explores a transformer-based spatiotemporal attention network (STAN) for gradient-based time-series explanations in video classification. The author trained the STAN model on global and local views of time-series data with weakly supervised labels, then applied a gradient-based XAI technique to identify the salient frames. Experimental results on four medically relevant activities demonstrate the potential of the STAN model to identify important frames of videos.

Low Difficulty Summary (written by GrooveSquid.com, original content)
The paper looks at using a special kind of AI network to help explain how machines make decisions about video footage. The author tested this network, called STAN, by training it to classify short video clips based on what is happening in them. A technique called saliency maps was then used to work out which parts of a video matter most for the machine's decision. The results suggest that STAN can identify key moments in videos of medically relevant activities.
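To make the idea of gradient-based frame saliency more concrete, here is a minimal sketch in plain Python. It is not the STAN model or the paper's procedure: the classifier is a toy attention-pooling scorer, the names `attention_pool_score` and `frame_saliency` and all the numbers are hypothetical, and central finite differences stand in for the backpropagated gradients a real framework would compute. The only point illustrated is that a frame's saliency can be taken as the size of the gradient of the class score with respect to that frame.

```python
import math


def attention_pool_score(frames, query, weights):
    """Toy classifier (an assumption, not STAN): softmax attention over
    frames, attention-weighted pooling, then a linear class score."""
    logits = [sum(q * x for q, x in zip(query, frame)) for frame in frames]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    attention = [e / total for e in exps]
    pooled = [
        sum(a * frame[d] for a, frame in zip(attention, frames))
        for d in range(len(weights))
    ]
    return sum(w * p for w, p in zip(weights, pooled))


def frame_saliency(frames, score_fn, eps=1e-5):
    """Per-frame saliency: L2 norm of d(score)/d(frame), estimated with
    central finite differences instead of backpropagation."""
    saliency = []
    for t, frame in enumerate(frames):
        squared = 0.0
        for d in range(len(frame)):
            plus = [list(f) for f in frames]
            minus = [list(f) for f in frames]
            plus[t][d] += eps
            minus[t][d] -= eps
            grad = (score_fn(plus) - score_fn(minus)) / (2 * eps)
            squared += grad * grad
        saliency.append(math.sqrt(squared))
    return saliency


# Hypothetical 4-frame clip with 2 features per frame.
frames = [[0.1, 0.0], [0.0, 0.1], [2.0, 1.5], [0.1, 0.1]]
query = [1.0, 1.0]      # hypothetical attention query
weights = [1.0, -0.5]   # hypothetical class weights


def score(clip):
    return attention_pool_score(clip, query, weights)


saliency = frame_saliency(frames, score)
most_salient = max(range(len(frames)), key=lambda t: saliency[t])
print("per-frame saliency:", [round(s, 4) for s in saliency])
print("most salient frame index:", most_salient)
```

In practice an automatic-differentiation library would compute these gradients in one backward pass, and the per-frame scores would be compared against annotated important frames.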

Keywords

» Artificial intelligence  » Attention  » Spatiotemporal  » Supervised  » Time series  » Transformer