Summary of OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision, by Junjie Wang et al.
OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision
by Junjie Wang, Bin Chen, Bin Kang, Yulin Li, YiChi Chen, Weizhi Xian, Huifeng Chang, Yong Xu
First submitted to arXiv on: 28 May 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary: Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary: The paper introduces OV-DQUO, an open-vocabulary detector that addresses the confidence bias of existing detectors, which are trained on base categories and tend to assign higher confidence to those categories than to novel ones. OV-DQUO uses wildcard matching to learn from pairs of open-world unknown objects and text embeddings with general semantics, which reduces the confidence bias between base and novel categories. The paper also introduces a denoising text query training strategy, which synthesizes foreground and background query-box pairs from open-world unknown objects and trains the detector on them through contrastive learning. OV-DQUO achieves new state-of-the-art results on the challenging OV-COCO and OV-LVIS benchmarks, with an AP50 of 45.6 and an mAP of 39.3 on novel categories, respectively, without requiring additional training data. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary: This paper is about a computer program that can recognize new objects it has never seen before. Current programs are good at recognizing the objects they were trained on, but not as good at recognizing new ones. To solve this, the authors created a new program called OV-DQUO that uses special techniques to learn from unknown objects and text descriptions, which makes it more accurate when recognizing new objects. The authors tested their program on two challenging datasets and achieved better results than previous programs without needing any additional training data. |
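
For readers unfamiliar with the AP50 figure quoted in the medium summary: AP50 is average precision computed with an intersection-over-union (IoU) threshold of 0.5, meaning a predicted box counts as a correct detection only if it overlaps a ground-truth box of the same class with IoU of at least 0.5. The sketch below illustrates only that matching criterion. It is not taken from the paper, the function names `iou` and `is_match_at_50` are hypothetical, and boxes are assumed to be `(x1, y1, x2, y2)` corner coordinates.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box, threshold=0.5):
    """True if the prediction overlaps the ground truth enough to count at AP50."""
    return iou(pred_box, gt_box) >= threshold


# Hypothetical boxes, for illustration only.
ground_truth = (0, 0, 10, 10)
close_prediction = (2, 0, 12, 10)   # shifted by 2 units
loose_prediction = (5, 0, 15, 10)   # shifted by 5 units

print(is_match_at_50(close_prediction, ground_truth))
print(is_match_at_50(loose_prediction, ground_truth))
```

A full AP50 computation additionally ranks predictions by confidence and integrates the precision-recall curve per class, which this sketch omits.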
Keywords
» Artificial intelligence » Semantics