
Summary of Are Deep Learning Models Robust to Partial Object Occlusion in Visual Recognition Tasks?, by Kaleb Kassaw et al.


Are Deep Learning Models Robust to Partial Object Occlusion in Visual Recognition Tasks?

by Kaleb Kassaw, Francesco Luzi, Leslie M. Collins, Jordan M. Malof

First submitted to arXiv on: 16 Sep 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty Written by Summary
High Paper authors High Difficulty Summary
Read the original abstract here
Medium GrooveSquid.com (original content) Medium Difficulty Summary
The paper proposes a new dataset for evaluating image classification models under partial occlusion. The Image Recognition Under Occlusion (IRUO) dataset uses both real-world and artificially occluded images to test the robustness of leading methods in visual recognition tasks. The paper also reports a human study that measures human classification accuracy across multiple levels and types of occlusion. It finds that modern CNN-based models recognize occluded images more accurately than earlier CNN-based models, and that ViT-based models are more accurate than CNN-based models, falling only modestly short of human accuracy. The paper also highlights the importance of considering different types of occlusion, such as diffuse occlusion, where objects are seen through “holes” in occluders.
Low GrooveSquid.com (original content) Low Difficulty Summary
The research creates a new dataset to test how well image classification models work when parts of an image are blocked from view. This matters because many real-world images have things covering them up, like leaves or fences. The researchers build this dataset from both real images where the blocking occurs naturally and images where the blocking was added artificially. They also ran a study with humans to see how well people recognize objects in these partially occluded images. The results show that newer models do better than older ones when something is blocking the view. But even the best AI models don’t do as well as humans, especially when things like leaves or fences cover up parts of the image.
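
To make the idea of testing under diffuse occlusion concrete, here is a minimal Python sketch. It is not the paper's code or the IRUO pipeline: the function names `diffuse_occlude` and `accuracy_by_occlusion`, the random-pixel occluder, and the `predict` callable are all assumptions chosen for illustration. It hides a chosen fraction of pixels at scattered positions, so the object stays visible through the remaining "holes", and then measures a classifier's accuracy at each occlusion level.

```python
import numpy as np


def diffuse_occlude(image, fraction, rng, fill=0):
    """Cover roughly `fraction` of the pixels at random positions.

    Hypothetical stand-in for a diffuse occluder (e.g. a fence or leaves):
    the uncovered pixels act as the "holes" the object is seen through.
    Returns the occluded copy and the boolean mask of covered pixels.
    """
    height, width = image.shape[:2]
    # True where the occluder covers a pixel.
    mask = rng.random((height, width)) < fraction
    occluded = image.copy()
    occluded[mask] = fill
    return occluded, mask


def accuracy_by_occlusion(predict, images, labels, fractions, seed=0):
    """Return {fraction: accuracy} for a classifier under diffuse occlusion.

    `predict` is any callable mapping one image array to a label; it is an
    assumption of this sketch, not an API from the paper.
    """
    rng = np.random.default_rng(seed)
    results = {}
    for fraction in fractions:
        correct = 0
        for image, label in zip(images, labels):
            occluded, _ = diffuse_occlude(image, fraction, rng)
            correct += int(predict(occluded) == label)
        results[fraction] = correct / len(images)
    return results
```

Plotting the returned accuracies against the occlusion fraction for several models, alongside human accuracy on the same images, would give the kind of comparison the summaries describe. A real evaluation would use naturally occluded images as well, since randomly dropped pixels only roughly approximate occluders such as fences.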

Keywords

» Artificial intelligence  » Classification  » CNN  » Image classification  » ViT