Summary of From a Social Cognitive Perspective: Context-aware Visual Social Relationship Recognition, by Shiwei Wu et al.
From a Social Cognitive Perspective: Context-aware Visual Social Relationship Recognition
by Shiwei Wu, Chao Zhang, Joya Chen, Tong Xu, Likang Wu, Yao Hu, Enhong Chen
First submitted to arXiv on: 12 Jun 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract, available on arXiv. |
| Medium | GrooveSquid.com (original content) | The paper proposes a new approach to recognizing social relationships in images that draws on contextual understanding and subtle visual cues. The prevailing classification paradigm focuses on detected persons and objects, neglecting the broader scene context and crucial social factors. To address this, the authors build a lightweight adapter on top of a frozen CLIP model and incorporate social-aware semantics through a multi-modal side adapter tuning mechanism. The approach improves on previous methods by 12.2% on the People-in-Social-Context (PISC) dataset and by 9.8% on the People-in-Photo-Album (PIPA) benchmark. |
| Low | GrooveSquid.com (original content) | People's relationships are often revealed through objects or interactions in a scene, and these can be hard for a computer to recognize. Current methods focus on the detected people and objects but ignore the bigger picture and important social details. This paper proposes a new way to understand social relationships by looking at the context and subtle visual clues. It uses a special adapter that adds social meaning to an existing model, allowing the model to learn about social concepts. The approach outperforms previous methods at recognizing social relationships. |
Keywords
» Artificial intelligence » Classification » Multi-modal » Semantics