Summary of A Comparison Between Humans and AI at Recognizing Objects in Unusual Poses, by Netta Ollikka et al.
A comparison between humans and AI at recognizing objects in unusual poses
by Netta Ollikka, Amro Abbas, Andrea Perin, Markku Kilpeläinen, Stéphane Deny
First submitted to arXiv on: 6 Feb 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract on arXiv. |
Medium | GrooveSquid.com (original content) | Deep learning models have made significant progress in object recognition, but they still struggle with images showing objects in unusual poses. This paper investigates that gap by comparing human and AI performance at recognizing such challenging images. State-of-the-art deep networks such as EfficientNet, SWAG, ViT, SWIN, BEiT, and ConvNext are brittle when faced with unusual poses, whereas Gemini shows excellent robustness. Interestingly, when image exposure time is limited, human performance degrades to the level of the deep networks, suggesting that additional mental processes, which require extra viewing time, are needed to recognize objects in unusual poses. The analysis also reveals that humans and deep networks rely on different mechanisms when recognizing objects, highlighting the importance of understanding the mental processes at work during this extra viewing time in order to reproduce the robustness of human vision in machines. |
Low | GrooveSquid.com (original content) | Imagine trying to recognize an object when it’s not in a normal position. For example, you’re used to seeing dogs sitting or standing on all fours, and then suddenly you see one jumping through the air. Humans are great at recognizing objects even when they’re in unusual positions, but computer programs that use deep learning aren’t as good at this: they struggle to recognize objects that aren’t in their usual poses. This paper compares how well humans and computers recognize objects in unusual poses and finds that they rely on different ways of “thinking”. The study suggests that there may be a way to make computers more human-like, but it will require understanding the mental processes that happen when we look at something for longer than usual. |
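
To make the kind of evaluation described above concrete, here is a minimal, hypothetical sketch of probing a pretrained classifier with an object shown in a canonical versus an atypical orientation. This is not the paper’s code: it uses torchvision’s off-the-shelf ConvNeXt (one of the model families the paper tests), a simple in-plane rotation stands in for the 3D unusual poses the paper actually studies, and the file name `dog.jpg` is made up.

```python
# Illustrative sketch only (not the paper's method): compare a pretrained
# classifier's predictions for a canonical vs. rotated ("unusual") view.
import torch
from torchvision import models
from torchvision.transforms import functional as F
from PIL import Image

# ConvNeXt is one of the model families evaluated in the paper.
weights = models.ConvNeXt_Tiny_Weights.DEFAULT
model = models.convnext_tiny(weights=weights).eval()
preprocess = weights.transforms()  # the weights' matching preprocessing

image = Image.open("dog.jpg")  # hypothetical input image

for angle in (0, 90, 180):  # 0 degrees = canonical pose
    rotated = F.rotate(image, angle)
    with torch.no_grad():
        logits = model(preprocess(rotated).unsqueeze(0))
    label = weights.meta["categories"][logits.argmax().item()]
    print(f"{angle:3d} deg -> predicted: {label}")
```

If the predicted label changes or degrades as the rotation grows, the model is showing the kind of pose brittleness the paper reports; a human, given enough viewing time, would typically keep recognizing the object.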
Keywords
* Artificial intelligence * Deep learning * Gemini * ViT