Human-inspired Explanations for Vision Transformers and Convolutional Neural Networks
by Mahadev Prasad Panda, Matteo Tiezzi, Martina Vilas, Gemma Roig, Bjoern M. Eskofier, Dario Zanca
First submitted to arXiv on: 4 Aug 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper but is written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | We introduce Foveation-based Explanations (FovEx), a novel explainability method for deep neural networks inspired by human visual perception. The approach achieves state-of-the-art performance on both transformer and convolutional models, showcasing its versatility. Our evaluation highlights the alignment between FovEx’s explanation maps and human gaze patterns (+14% NSS compared to RISE, +203% NSS compared to Grad-CAM), emphasizing its effectiveness in closing the interpretation gap between humans and machines. (A sketch of the NSS metric follows the table.) |
| Low | GrooveSquid.com (original content) | We created a new way to understand how deep learning models work. It’s called Foveation-based Explanations (FovEx). This approach helps us see which parts of an image are most important for a model’s decision. Our method performs well and can be used with different types of models, such as transformers and convolutional networks. We even showed that our explanations match how humans look at things (+14% in NSS compared to another method), which makes it useful for understanding what machines are doing. |
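For readers curious about the NSS figures quoted above: Normalized Scanpath Saliency (NSS) is a standard gaze-alignment metric that z-scores a saliency (explanation) map and averages the result at human fixation locations. Higher values mean the explanation places more mass where people actually look. The sketch below is a minimal, generic implementation of the metric, not code from the FovEx paper; the function name, array shapes, and toy data are illustrative assumptions.

```python
import numpy as np

def nss(saliency_map: np.ndarray, fixation_mask: np.ndarray) -> float:
    """Mean z-scored saliency value at human fixation locations."""
    # Z-score the map so NSS scores are comparable across explanation maps.
    s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-8)
    # Average the normalized saliency at the fixated pixels.
    return float(s[fixation_mask].mean())

# Toy usage: a random explanation map and three hypothetical fixation points.
rng = np.random.default_rng(0)
saliency = rng.random((224, 224))
fixations = np.zeros((224, 224), dtype=bool)
fixations[[50, 100, 150], [60, 110, 160]] = True
print(f"NSS: {nss(saliency, fixations):.3f}")
```

An NSS of 0 means the explanation is no better aligned with gaze than chance, so relative gains like the +14% over RISE reported above are measured on this scale.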
Keywords
» Artificial intelligence » Alignment » Deep learning » Transformer