
Summary of PEEB: Part-based Image Classifiers with an Explainable and Editable Language Bottleneck, by Thang M. Pham et al.


PEEB: Part-based Image Classifiers with an Explainable and Editable Language Bottleneck

by Thang M. Pham, Peijie Chen, Tin Nguyen, Seunghyun Yoon, Trung Bui, Anh Totti Nguyen

First submitted to arXiv on: 8 Mar 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors): read the original abstract here.
Medium Difficulty Summary (original content by GrooveSquid.com):
The paper proposes a novel approach to fine-grained classification with CLIP-based classifiers: each class name is expanded into text descriptors that describe the visual parts of that class. The resulting classifier, PEEB (Part-based image classifiers with an Explainable and Editable language Bottleneck), matches detected part embeddings against these textual descriptors to compute logit scores for classification. In zero-shot settings, PEEB outperforms CLIP by a significant margin, achieving top-1 accuracy ~10x higher. On the supervised-learning benchmarks CUB-200 and Dogs-120, PEEB achieves state-of-the-art (SOTA) accuracy of 88.80% and 92.20%, respectively. Additionally, PEEB lets users edit the text descriptors to form new classifiers without retraining.
Low Difficulty Summary (original content by GrooveSquid.com):
Imagine a system that can learn new classes just by being told what they look like. That is what the authors propose in this paper. They introduce an AI model called PEEB (Part-based image classifiers with an Explainable and Editable language Bottleneck) that can classify objects into categories it has never seen before. The main idea is to expand each class name into words that describe the visual parts of that class. The system then matches these descriptions against the part features it detects in an image to make a prediction. This approach works well and beats other state-of-the-art models.
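The scoring idea in both summaries, matching detected part embeddings against per-class textual descriptors and summing the matches into a logit, can be sketched in a few lines of numpy. This is a toy illustration only, not the authors' implementation: the array shapes, the one-descriptor-per-part layout, and the function name `peeb_logits` are all assumptions made for the example.

```python
import numpy as np

def peeb_logits(part_embs, class_descriptor_embs):
    """Toy PEEB-style scoring sketch (not the authors' code).

    part_embs: (P, D) array -- one embedding per detected object part.
    class_descriptor_embs: (C, P, D) array -- one text-descriptor
        embedding per part, per class (descriptor k of a class
        describes part k).
    Returns: (C,) array -- each class's logit is the sum over parts of
        the dot product between the part embedding and that class's
        matching descriptor embedding.
    """
    # For each class c: sum_p part_embs[p] . class_descriptor_embs[c, p]
    return np.einsum('pd,cpd->c', part_embs, class_descriptor_embs)

rng = np.random.default_rng(0)
P, D, C = 12, 64, 3                   # 12 parts, 64-dim embeddings, 3 classes
parts = rng.normal(size=(P, D))
descriptors = rng.normal(size=(C, P, D))
logits = peeb_logits(parts, descriptors)
pred = int(np.argmax(logits))         # predicted class index

# The "editable" property: rewriting one class's descriptors changes
# only that class's logit, so the classifier is repurposed without
# any retraining.
descriptors[1] = parts                # class 1 now matches these parts exactly
edited_logits = peeb_logits(parts, descriptors)
```

Because each class's logit depends only on its own descriptors, swapping in new descriptor text (and re-embedding it) is enough to define a new class, which is the editability the summaries highlight.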

Keywords

» Artificial intelligence  » Classification  » Supervised  » Zero shot