Summary of OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision, by Junjie Wang et al.


OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision

by Junjie Wang, Bin Chen, Bin Kang, Yulin Li, YiChi Chen, Weizhi Xian, Huifeng Chang, Yong Xu

First submitted to arXiv on: 28 May 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper introduces OV-DQUO, an open-vocabulary detector that addresses the confidence bias of existing detectors, which are trained only on base categories. The model uses wildcard matching to learn from open-world unknown objects paired with text embeddings that carry general semantics, which reduces the confidence bias between base and novel categories. The paper also introduces a denoising text query training strategy, which synthesizes foreground and background query-box pairs from open-world unknown objects and trains the detector on them through contrastive learning. OV-DQUO achieves new state-of-the-art results on novel categories on two challenging benchmarks, with an AP50 of 45.6 on OV-COCO and an mAP of 39.3 on OV-LVIS, without requiring additional training data.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about a computer program that can recognize kinds of objects it was never trained on. Current programs are good at recognizing the objects they were trained on, but much weaker at recognizing new ones. To solve this, the authors created a program called OV-DQUO that uses special techniques to learn from unknown objects and text descriptions, which makes it more accurate on new objects. The authors tested their program on two challenging datasets and achieved better results than previous programs without needing any additional training data.
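
For readers unfamiliar with the AP50 figure quoted in the medium summary: the "50" refers to an intersection-over-union (IoU) threshold of 0.5 between a predicted box and a ground-truth box. The sketch below shows only that overlap check. It is not code from the paper, the function names `box_iou` and `is_match_at_50` are hypothetical, and a full AP computation would also rank predictions by confidence and integrate a precision-recall curve.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box):
    """Assumed matching rule for AP50: IoU of at least 0.5 (hypothetical helper)."""
    return box_iou(pred_box, gt_box) >= 0.5


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    print(is_match_at_50((0, 0, 10, 8), gt))    # large overlap
    print(is_match_at_50((5, 5, 15, 15), gt))   # small overlap
```

Under this rule, a prediction that covers most of the ground-truth box counts as a match, while one that only clips a corner does not.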

Keywords

» Artificial intelligence  » Semantics