Summary of Open-Vocabulary Object Detection via Language Hierarchy, by Jiaxing Huang et al.
Open-Vocabulary Object Detection via Language Hierarchy
by Jiaxing Huang, Jingyi Zhang, Kai Jiang, Shijian Lu
First submitted to arXiv on: 27 Oct 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The paper presents a new approach to weakly-supervised object detection, addressing the image-to-box label mismatch in existing methods. The authors introduce Language Hierarchical Self-training (LHST), which incorporates language hierarchy into detector training to learn more generalizable detectors. LHST expands image-level labels with a hierarchical structure and enables co-regularization between the expanded labels and self-training. This provides richer supervision, mitigates the image-to-box label mismatch, and selects labels based on their predicted reliability. The authors also design a prompt generation method that introduces language hierarchy to bridge vocabulary gaps between training and testing. Experimental results show that LHST achieves superior generalization performance across 14 object detection datasets. |
| Low | GrooveSquid.com (original content) | The paper is about finding a better way to teach machines to detect objects in pictures, even when the training data isn't perfect. Right now, many methods use weak supervision, meaning they rely on broad image-level labels like "dog" or "car" rather than precise annotations such as a box drawn around each object. This can lead to problems where the machine is good at detecting certain types of objects but not others. The authors propose a new approach called Language Hierarchical Self-training (LHST) that addresses this issue by providing richer labels and selecting the most reliable ones. They also develop a way to generate prompts that help machines learn from different types of data. Overall, the paper shows that LHST can improve object detection performance across many datasets. |
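The core idea described above can be illustrated with a minimal sketch: image-level labels are expanded along a language hierarchy (so "dog" also supervises "animal"), and self-training keeps only pseudo-labels judged reliable. The hierarchy, function names, and the use of confidence as a reliability proxy below are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of language-hierarchy label expansion and
# reliability-based pseudo-label selection. The hierarchy, names, and
# threshold are illustrative; the paper derives its hierarchy from
# language structure and learns reliability during training.

# Toy label hierarchy mapping each class to its ancestors.
HIERARCHY = {
    "dog": ["animal", "mammal"],
    "car": ["vehicle"],
    "sparrow": ["bird", "animal"],
}

def expand_labels(image_labels):
    """Expand image-level labels with their ancestors in the hierarchy."""
    expanded = set(image_labels)
    for label in image_labels:
        expanded.update(HIERARCHY.get(label, []))
    return expanded

def select_reliable(predictions, expanded_labels, threshold=0.6):
    """Keep predicted boxes whose class is consistent with the expanded
    label set and whose confidence (a stand-in for predicted
    reliability) clears the threshold."""
    return [
        (cls, score)
        for cls, score in predictions
        if cls in expanded_labels and score >= threshold
    ]

# Example: an image tagged "dog" and "car" at the image level.
labels = expand_labels(["dog", "car"])
preds = [("dog", 0.9), ("animal", 0.7), ("cat", 0.8), ("vehicle", 0.3)]
kept = select_reliable(preds, labels)
# "cat" is dropped (not in the expanded label set) and the low-confidence
# "vehicle" box is dropped, while "dog" and the expanded "animal" survive.
```

The expansion step is what supplies the richer supervision the summary mentions, while the reliability filter is what lets self-training and the expanded labels co-regularize each other.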
Keywords
» Artificial intelligence » Generalization » Object detection » Prompt » Regularization » Self training » Supervised