Summary of HM3D-OVON: A Dataset and Benchmark for Open-Vocabulary Object Goal Navigation, by Naoki Yokoyama et al.


HM3D-OVON: A Dataset and Benchmark for Open-Vocabulary Object Goal Navigation

by Naoki Yokoyama, Ram Ramrakhya, Abhishek Das, Dhruv Batra, Sehoon Ha

First submitted to arXiv on: 22 Sep 2024

Categories

  • Main: Artificial Intelligence (cs.AI)
  • Secondary: Robotics (cs.RO)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty | Written by | Summary
High | Paper authors | High Difficulty Summary: Read the original abstract here
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary:
The Habitat-Matterport 3D Open-Vocabulary Object Goal Navigation dataset (HM3D-OVON) is a large-scale benchmark that broadens the scope and semantic range of earlier Object Goal Navigation (ObjectNav) benchmarks. It contains over 15,000 annotated instances of household objects across 379 categories, derived from photo-realistic 3D scans of real-world environments. Earlier ObjectNav datasets restrict goals to a fixed set of categories; HM3D-OVON instead supports training and evaluating models on an open set of goals specified through free-form language at test time. This open-vocabulary formulation encourages visuo-semantic navigation behaviors that can search for any object described in text. The authors evaluate and compare several approaches on HM3D-OVON and find that an open-vocabulary ObjectNav agent trained on it achieves higher performance, and is more robust to localization and actuation noise, than the state-of-the-art approach. The authors hope the benchmark will drive interest in embodied agents that can navigate real-world spaces.
Low | GrooveSquid.com (original content) | Low Difficulty Summary:
The paper presents a large new dataset called HM3D-OVON that helps computers learn to find objects in homes from text descriptions. Earlier benchmarks limited the search to a fixed list of object types, but with this dataset, models can be trained to search for any object described in text. This matters because it could help robots and other machines become more useful in our daily lives.

Keywords

  • Artificial intelligence