Summary of OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision, by Junjie Wang et al.

OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision

by Junjie Wang, Bin Chen, Bin Kang, Yulin Li, YiChi Chen, Weizhi Xian, Huifeng Chang, Yong Xu

First submitted to arXiv on: 28 May 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	Read the original abstract here
Medium	GrooveSquid.com (original content)	The paper introduces OV-DQUO, an open-vocabulary detector that addresses confidence bias: detectors trained only on base categories tend to score novel categories with lower confidence. The model uses wildcard matching to learn from open-world unknown objects, pairing them with text embeddings that carry general semantics, which narrows the confidence gap between base and novel categories. The authors also propose a denoising text query training strategy, which synthesizes foreground and background query-box pairs from open-world unknown objects and trains the detector on them through contrastive learning. OV-DQUO achieves new state-of-the-art results on novel categories, with 45.6 AP50 on OV-COCO and 39.3 mAP on OV-LVIS, without requiring additional training data.
Low	GrooveSquid.com (original content)	This paper is about a computer program that can recognize new kinds of objects it was never trained on. The problem is that current programs are good at recognizing the objects they were trained on, but less confident about new ones. To solve this, the authors created a new program called OV-DQUO that uses special techniques to learn from unknown objects and text descriptions. This makes the program more accurate when recognizing new objects. The authors tested their program on two challenging datasets and achieved better results than previous programs without needing any additional training data.
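
To make the idea of "foreground and background query-box pairs" in the medium summary more concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the jitter scales, and the 0.5 IoU cutoff for calling a noisy box "foreground" are all assumptions chosen for illustration, and the contrastive loss itself is left out. It only shows how shifted copies of one unknown-object box could be labeled by how much they still overlap the original.

```python
import random


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def jitter(box, scale, rng):
    """Shift a box by a random fraction (up to `scale`) of its width and height."""
    x1, y1, x2, y2 = box
    dx = rng.uniform(-scale, scale) * (x2 - x1)
    dy = rng.uniform(-scale, scale) * (y2 - y1)
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


def make_noisy_pairs(box, n, rng, fg_iou=0.5):
    """Hypothetical helper: build n (noisy_box, label) pairs from one box.

    Even-indexed samples get small noise, odd-indexed samples get large noise.
    The label depends only on the IoU with the original box (assumed cutoff).
    """
    pairs = []
    for i in range(n):
        scale = 0.1 if i % 2 == 0 else 1.0
        noisy = jitter(box, scale, rng)
        label = "foreground" if iou(box, noisy) >= fg_iou else "background"
        pairs.append((noisy, label))
    return pairs


rng = random.Random(0)
unknown_box = (10.0, 10.0, 50.0, 40.0)  # made-up box for an unknown object
pairs = make_noisy_pairs(unknown_box, 6, rng)
for noisy, label in pairs:
    print(label, round(iou(unknown_box, noisy), 2))
```

The same 0.5 IoU cutoff is what the "50" in AP50 refers to: a predicted box counts as a correct detection only if it overlaps a ground-truth box with IoU of at least 0.5.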

Keywords

» Artificial intelligence » Semantics
