Summary of Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition, by Lilang Lin et al.
Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition
by Lilang Lin, Lehong Wu, Jiahang Zhang, Jiaying Liu
First submitted to arXiv on: 27 Oct 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | A novel skeleton-based idempotent generative model (IGM) is proposed for unsupervised representation learning in skeleton-based action recognition. Existing pre-trained generative methods retain redundant information that is unrelated to recognition and at odds with the spatially sparse, temporally consistent nature of skeleton data. IGM addresses this by theoretically establishing an equivalence between generative models and maximum entropy coding, then introducing contrastive learning to make the learned features more compact. An idempotency constraint further imposes a stronger consistency regularization in the feature space, pushing features to retain the information critical for recognizing motion semantics (a hedged code sketch of this constraint follows the table). Experiments on the NTU RGB+D and PKU-MMD datasets demonstrate IGM's effectiveness, improving accuracy from 84.6% to 86.2% on the NTU 60 xsub benchmark; zero-shot adaptation scenarios also show promising results. |
| Low | GrooveSquid.com (original content) | A new way of learning features for recognizing human actions from skeleton data is presented. The method uses generative models, which are usually good at generating new data but can also be used for recognition. The challenge is that existing methods keep extra information that is not useful for recognition. The proposed method, IGM, addresses this by making the features more compact and focused on what matters for action recognition. Results show that IGM outperforms other methods on several datasets. |
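
Since the idempotency constraint is the paper's key mechanism, a brief illustration may help. Below is a minimal sketch, in PyTorch, of how such a feature-space consistency term could be combined with a contrastive objective during generative pre-training. The `encoder`/`decoder` modules, the InfoNCE-style contrastive term, and the loss weights `lambda_c`/`lambda_i` are illustrative assumptions, not the authors' actual architecture or hyperparameters.

```python
import torch
import torch.nn.functional as F

def idempotency_loss(encoder, decoder, x):
    """Consistency term: encoding a generated reconstruction should
    reproduce the features of the original input, i.e. f(g(f(x))) ≈ f(x).
    `encoder` and `decoder` are placeholder modules, not the paper's networks."""
    z = encoder(x)            # features of the input skeleton sequence
    x_hat = decoder(z)        # generative reconstruction
    z_hat = encoder(x_hat)    # features of the reconstruction
    # Stop-gradient on the target keeps the constraint one-directional.
    return F.mse_loss(z_hat, z.detach())

def contrastive_loss(z1, z2, temperature=0.1):
    """A standard InfoNCE term (an assumed stand-in for the paper's
    contrastive objective): two views of the same sequence are pulled
    together, different sequences pushed apart, compacting the features."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                 # (B, B) similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

# Hypothetical combined pre-training objective:
#   loss = reconstruction_loss
#        + lambda_c * contrastive_loss(z1, z2)
#        + lambda_i * idempotency_loss(encoder, decoder, x)
```

The stop-gradient on `z` mirrors common consistency-regularization practice; whether IGM applies one is a design choice of this sketch rather than a claim about the paper.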
Keywords
» Artificial intelligence » Generative model » Regularization » Representation learning » Semantics » Unsupervised » Zero shot