Summary of SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders, by Sheng-Wei Li, Zi-Xiang Wei, Wei-Jie Chen, Yi-Hsin Yu, Chih-Yuan Yang, and Jane Yung-jen Hsu
SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders
by Sheng-Wei Li, Zi-Xiang Wei, Wei-Jie Chen, Yi-Hsin Yu, Chih-Yuan Yang, Jane Yung-jen Hsu
First submitted to arXiv on: 18 Jul 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The paper proposes SA-DVAE (Semantic Alignment via Disentangled Variational Autoencoders), a method for zero-shot skeleton-based action recognition. Methods in this setting must align skeleton features with semantic embeddings of the class labels, and that alignment is made harder by an inherent imbalance in action recognition datasets: skeleton sequences vary widely while the class labels stay constant. SA-DVAE addresses this by disentangling skeleton features into two independent parts, one semantic-related and one semantic-irrelevant, to better align skeleton and semantic features. The method is implemented as a pair of modality-specific variational autoencoders coupled with a total correlation penalty. Experiments on three benchmark datasets (NTU RGB+D, NTU RGB+D 120, and PKU-MMD) show improved performance over existing methods. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary SA-DVAE is a new way to recognize actions from skeleton data, including actions the model never saw during training (this is what “zero-shot” means). One difficulty is that skeleton recordings of the same action vary a lot, while the label describing the action always stays the same. SA-DVAE handles this by separating the skeleton features into two parts: one that’s related to the meaning of the action and another that’s not. This helps align the skeleton features with the semantic meanings (like “running” or “jumping”). The paper tests SA-DVAE on three big datasets and shows it works better than other methods. |
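
To make the idea of splitting skeleton features into a semantic-related part and an irrelevant part more concrete, here is a minimal toy sketch. It is not the authors' implementation: the function names (`split_latent`, `zero_shot_predict`), the fixed split position, the hand-written vectors, and the use of cosine similarity as the matching rule are all assumptions made for illustration. In the paper the two parts are learned by variational autoencoders rather than sliced by hand.

```python
import math


def split_latent(z, semantic_dim):
    """Split a latent vector into (semantic-related, semantic-irrelevant) parts.

    Hypothetical helper: a fixed slice stands in for a learned disentanglement.
    """
    return z[:semantic_dim], z[semantic_dim:]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def zero_shot_predict(skeleton_latent, class_text_latents, semantic_dim):
    """Pick the class whose text latent best matches the semantic-related part.

    The semantic-irrelevant part is ignored when matching, which is the
    point of the disentanglement in this toy setting.
    """
    z_related, _z_irrelevant = split_latent(skeleton_latent, semantic_dim)
    return max(
        class_text_latents,
        key=lambda name: cosine(z_related, class_text_latents[name]),
    )


# Toy data (made up): two class names with 2-D text latents.
text_latents = {"running": [1.0, 0.0], "jumping": [0.0, 1.0]}

# First two entries are semantic-related; the last two are nuisance
# variation that would dominate the comparison if it were not split off.
skeleton_latent = [0.9, 0.1, 5.0, -3.0]

prediction = zero_shot_predict(skeleton_latent, text_latents, semantic_dim=2)
print(prediction)  # running
```

In this toy example the large values in the irrelevant part do not affect the prediction, because only the first two dimensions are compared against the class text latents.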
Keywords
* Artificial intelligence
* Alignment