
Summary of PEEB: Part-based Image Classifiers with an Explainable and Editable Language Bottleneck, by Thang M. Pham et al.


PEEB: Part-based Image Classifiers with an Explainable and Editable Language Bottleneck

by Thang M. Pham, Peijie Chen, Tin Nguyen, Seunghyun Yoon, Trung Bui, Anh Totti Nguyen

First submitted to arXiv on: 8 Mar 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors): read the original abstract here.
Medium Difficulty Summary (original content by GrooveSquid.com):
The paper proposes a novel approach to fine-grained classification with CLIP-based classifiers: each class name is expanded into text descriptors that describe the visual parts of that class. The resulting classifier, PEEB (Part-based image classifiers with an Explainable and Editable language Bottleneck), matches detected part embeddings against these textual descriptors to compute logit scores for classification. In zero-shot settings, PEEB outperforms CLIP by a significant margin, achieving top-1 accuracy ~10x higher. On the supervised-learning benchmarks CUB-200 and Dogs-120, PEEB achieves state-of-the-art (SOTA) accuracy of 88.80% and 92.20%, respectively. Additionally, PEEB lets users edit the text descriptors to form new classifiers without retraining.
Low Difficulty Summary (original content by GrooveSquid.com):
Imagine a system that can learn new classes just by being told what they look like. That is what the authors propose in this paper. They introduce an AI model called PEEB (Part-based image classifiers with an Explainable and Editable language Bottleneck) that can classify objects into categories it has never seen before. The main idea is to expand each class name into words that describe the visual parts of that class. The system then matches these descriptions against the part features it detects in an image to make a prediction. This approach works well and beats other state-of-the-art models.
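The scoring idea in both summaries, matching detected part embeddings against per-class textual descriptors and summing the matches into a logit, can be sketched in a few lines of numpy. This is a toy illustration only, not the authors' implementation: the array shapes, the one-descriptor-per-part layout, and the function name `peeb_logits` are all assumptions made for the example.

```python
import numpy as np

def peeb_logits(part_embs, class_descriptor_embs):
    """Toy PEEB-style scoring sketch (not the authors' code).

    part_embs: (P, D) array -- one embedding per detected object part.
    class_descriptor_embs: (C, P, D) array -- one text-descriptor
        embedding per part, per class (descriptor k of a class
        describes part k).
    Returns: (C,) array -- each class's logit is the sum over parts of
        the dot product between the part embedding and that class's
        matching descriptor embedding.
    """
    # For each class c: sum_p part_embs[p] . class_descriptor_embs[c, p]
    return np.einsum('pd,cpd->c', part_embs, class_descriptor_embs)

rng = np.random.default_rng(0)
P, D, C = 12, 64, 3                   # 12 parts, 64-dim embeddings, 3 classes
parts = rng.normal(size=(P, D))
descriptors = rng.normal(size=(C, P, D))
logits = peeb_logits(parts, descriptors)
pred = int(np.argmax(logits))         # predicted class index

# The "editable" property: rewriting one class's descriptors changes
# only that class's logit, so the classifier is repurposed without
# any retraining.
descriptors[1] = parts                # class 1 now matches these parts exactly
edited_logits = peeb_logits(parts, descriptors)
```

Because each class's logit depends only on its own descriptors, swapping in new descriptor text (and re-embedding it) is enough to define a new class, which is the editability the summaries highlight.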

Keywords

» Artificial intelligence  » Classification  » Supervised  » Zero shot