Summary of Cross-Task Multi-Branch Vision Transformer for Facial Expression and Mask Wearing Classification, by Armando Zhu et al.
Cross-Task Multi-Branch Vision Transformer for Facial Expression and Mask Wearing Classification
by Armando Zhu, Keqin Li, Tong Wu, Peng Zhao, Bo Hong
First submitted to arXiv on: 22 Apr 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper at a different level of difficulty. The medium-difficulty and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The paper proposes a unified multi-branch vision transformer that handles facial expression recognition (FER) and mask wearing classification together. A dual-branch architecture extracts features shared by both tasks. The tokens for each task are then processed in a separate branch, and the branches exchange information through a cross attention module, which reduces complexity compared with running a separate model for each task. Experimental results show that the model performs on par with or better than state-of-the-art methods on both FER and facial mask wearing classification. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary Wearing masks has become a normal part of daily life, but it creates a new problem: recognizing facial expressions when part of the face is covered. This paper proposes a computer program that handles both jobs at once: it recognizes facial expressions and works out whether someone is wearing a mask. The program uses two separate parts, one for each task, and shares information between them to make the job easier. This makes it more efficient than running a separate program for each task. The results show that it works as well as or better than the best existing programs at both tasks. |
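
The cross attention exchange mentioned in the summaries can be illustrated with a small sketch. This is not the authors' implementation: it is a generic scaled dot-product cross attention in NumPy, where each branch uses its own tokens as queries and the other branch's tokens as keys and values. The function name `cross_attention`, the token counts, the embedding size, and the random projection weights are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum first for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(own_tokens, other_tokens, w_q, w_k, w_v):
    """Update one branch's tokens with information from the other branch.

    Queries come from own_tokens; keys and values come from other_tokens.
    Returns the updated tokens (with a residual connection) and the
    attention weights.
    """
    q = own_tokens @ w_q
    k = other_tokens @ w_k
    v = other_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return own_tokens + weights @ v, weights

rng = np.random.default_rng(0)
dim = 8  # hypothetical embedding size

# Hypothetical token sets: 5 tokens in the expression branch, 3 in the mask branch.
fer_tokens = rng.normal(size=(5, dim))
mask_tokens = rng.normal(size=(3, dim))

def make_projections():
    # Random stand-ins for learned query/key/value projection matrices.
    return tuple(rng.normal(scale=dim ** -0.5, size=(dim, dim)) for _ in range(3))

fer_proj = make_projections()
mask_proj = make_projections()

# Each branch attends to the other, so information flows in both directions.
fer_out, fer_weights = cross_attention(fer_tokens, mask_tokens, *fer_proj)
mask_out, mask_weights = cross_attention(mask_tokens, fer_tokens, *mask_proj)

print(fer_out.shape, mask_out.shape)
```

Each branch keeps its own token count and embedding size after the exchange, so task-specific heads could still be applied to `fer_out` and `mask_out` separately. In a trained model the projection matrices would be learned rather than random.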
Keywords
» Artificial intelligence » Classification » Cross attention » Mask » Vision transformer