Summary of Attention Based Simple Primitives for Open World Compositional Zero-Shot Learning, by Ans Munir et al.
Attention Based Simple Primitives for Open World Compositional Zero-Shot Learning
by Ans Munir, Faisal Z. Qureshi, Muhammad Haris Khan, Mohsen Ali
First submitted to arXiv on: 18 Jul 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | In Compositional Zero-Shot Learning (CZSL), models must predict unseen compositions made up of attribute and object pairs. The paper tackles the harder Open World setting (OW-CZSL), in which every possible attribute–object combination is a candidate at test time. The proposed approach applies self-attention between attribute and object primitives to improve generalization from seen to unseen compositions, and at inference computes the similarity between the attended textual features and the visual features to produce predictions. To restrict the test space to realistic compositions, it leverages external knowledge from ConceptNet. The proposed Attention-based Simple Primitives (ASP) model achieves performance competitive with state-of-the-art methods. |
Low | GrooveSquid.com (original content) | Compositional Zero-Shot Learning tries to predict combinations of attributes and objects that a model has never seen before. This is a difficult task because there are many possible combinations. The paper works in the Open World setting, which considers all of them. The method uses attention mechanisms that help the model understand relationships between attributes and objects, and at prediction time it compares text and image features to pick the best match. To avoid unrealistic combinations, it uses extra knowledge from ConceptNet. The resulting ASP model performs on par with other top models. |
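The pipeline the summaries describe, attention between attribute and object primitives, similarity scoring against a visual feature, and a ConceptNet-style feasibility filter over the open-world test space, can be sketched in miniature. This is an illustrative NumPy toy with random embeddings, not the authors' implementation: the embedding dimension, the pooling step, and the hard-coded feasibility list are all assumptions standing in for learned components and the real ConceptNet lookup.

```python
import numpy as np

rng = np.random.default_rng(0)

def self_attention(x):
    """Scaled dot-product self-attention over a set of token vectors (rows of x)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def compose(attr_emb, obj_emb):
    """Attend between an attribute and an object embedding, then mean-pool
    into a single composition embedding (a stand-in for the paper's
    attended textual feature)."""
    attended = self_attention(np.stack([attr_emb, obj_emb]))
    return attended.mean(axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

d = 16  # assumed toy embedding dimension
attrs = {"wet": rng.normal(size=d), "old": rng.normal(size=d)}
objs = {"dog": rng.normal(size=d), "car": rng.normal(size=d)}

# Hard-coded feasibility mask standing in for ConceptNet-derived filtering
# of the open-world composition space:
feasible = [("wet", "dog"), ("old", "car"), ("old", "dog")]

image_feat = rng.normal(size=d)  # stand-in for a visual backbone feature

# Score only feasible compositions and predict the best match.
scores = {(a, o): cosine(image_feat, compose(attrs[a], objs[o]))
          for a, o in feasible}
pred = max(scores, key=scores.get)
print(pred)
```

The key open-world idea is visible even at this scale: the model never scores the full attribute-by-object grid, only the compositions the external-knowledge filter deems plausible.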
Keywords
* Artificial intelligence * Attention * Generalization * Inference * Self-attention * Zero-shot