Summary of SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders, by Sheng-Wei Li, Zi-Xiang Wei, Wei-Jie Chen, Yi-Hsin Yu, Chih-Yuan Yang, and Jane Yung-jen Hsu


SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders

by Sheng-Wei Li, Zi-Xiang Wei, Wei-Jie Chen, Yi-Hsin Yu, Chih-Yuan Yang, Jane Yung-jen Hsu

First submitted to arXiv on: 18 Jul 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Machine Learning (cs.LG)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

See the paper's original abstract.

Medium Difficulty Summary (written by GrooveSquid.com, original content)

The paper proposes SA-DVAE (Semantic Alignment via Disentangled Variational Autoencoders), a method for zero-shot skeleton-based action recognition. It targets an imbalance inherent in action recognition datasets: skeleton sequences vary widely while class labels stay constant, which makes the two hard to align. SA-DVAE first disentangles skeleton features into a semantic-related part and a semantic-irrelevant part so that skeleton and semantic features align better. The method is implemented as a pair of modality-specific variational autoencoders coupled with a total correlation penalty. Experiments on three benchmark datasets, NTU RGB+D, NTU RGB+D 120, and PKU-MMD, show improved performance over existing methods.

Low Difficulty Summary (written by GrooveSquid.com, original content)

SA-DVAE is a new way to recognize actions from skeleton data, including actions it saw no examples of during training (this is what “zero-shot” means). Existing methods struggle because the skeleton recordings of an action vary a lot, while the action's label always stays the same, so the two are hard to match. SA-DVAE addresses this by separating the skeleton features into two parts: one that’s related to the meaning of the action and another that’s not. This helps align the skeleton features with the semantic meanings (like “running” or “jumping”). The paper tests SA-DVAE on three big datasets and shows it works better than other methods.
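
The disentangling idea in the summaries above can be illustrated with a small toy sketch. This is not the authors' implementation: the function names (`split_latent`, `zero_shot_predict`), the fixed split point, the hand-written vectors, and the cosine-similarity matching are all assumptions made for illustration. In the paper, the two parts are learned by variational autoencoders. Here the split is just a slice, to show why ignoring the semantic-irrelevant part can help when matching against class meanings.

```python
import math

def split_latent(z, semantic_dim):
    """Split a latent vector into (semantic-related, semantic-irrelevant) parts.

    Assumption for illustration: the first `semantic_dim` entries are the
    semantic-related part. A real model would learn this separation.
    """
    return z[:semantic_dim], z[semantic_dim:]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def zero_shot_predict(z_skeleton, class_latents, semantic_dim):
    """Pick the class whose semantic vector best matches the semantic-related part.

    The semantic-irrelevant part is discarded before matching, so it cannot
    distort the comparison.
    """
    z_semantic, _ = split_latent(z_skeleton, semantic_dim)
    return max(class_latents, key=lambda name: cosine(z_semantic, class_latents[name]))

# Hypothetical semantic vectors for two classes not seen in training.
class_latents = {"running": [1.0, 0.0], "jumping": [0.0, 1.0]}

# Hypothetical skeleton latent: 2 semantic-related values, then 2 irrelevant ones.
z = [0.9, 0.1, 5.0, -3.0]

prediction = zero_shot_predict(z, class_latents, semantic_dim=2)
print(prediction)
```

In this toy setup, the large values in the last two entries of `z` stand in for variation unrelated to the action's meaning. Because they are dropped before the comparison, only the first two entries decide the match.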

Keywords

  • Artificial intelligence
  • Alignment